CONSERVED AND SPECIFIC STREPTOCOCCAL GENOMES

FIELD OF THE INVENTION 

The invention relates to polynucleotides which are conserved or specific to 
one or more species of Streptococcus, Streptococcus species serotypes, and/or 
serotype isolates. The conserved or specific genomic regions can be used to identify, 
screen and develop vaccines and other treatments for Streptococcal infections and can 
be used in diagnostic assays to diagnose and identify Streptococcal infections. 

BACKGROUND OF THE INVENTION 

The genus Streptococcus consists of Gram-positive, chain-forming, spherical 
bacterial cells. Three species of clinical interest are S. pneumoniae ("pneumococcus"
or "S. pn."), S. pyogenes ("group A streptococcus" or "GAS") and S. agalactiae ("group
B streptococcus" or "GBS"). Infections with these three pathogenic streptococci lead
to conditions including pharyngitis, toxic shock syndrome and necrotizing fasciitis. 

Once thought to infect only cows, GBS is now known to cause serious disease, 
bacteraemia and meningitis in immunocompromised individuals and neonates. There 
are two known types of neonatal infection. The first (early onset, usually within 5 
days of birth) is manifested by bacteraemia and infection. It is generally contracted 
vertically as a baby passes through the birth canal. GBS is thought to colonize the 
vagina of about 25% of young women; approximately 1% of infants born via a 
vaginal birth to colonised mothers will become infected. Mortality resulting from 
these infections is between 50 - 70%. The second type of neonatal infection is a 
meningitis that occurs 10 to 60 days after birth. If pregnant women are vaccinated 
with type III capsule so that the infants are passively immunised, the incidence of the
late onset meningitis is generally reduced, although not entirely eliminated. 

The "B" in "GBS" refers to the Lancefield classification, which is based on 
the antigenicity of a carbohydrate which is soluble in dilute acid and called the C 
carbohydrate. Lancefield identified 13 types of C carbohydrate, designated A to O,
that could be serologically differentiated. The organisms that most commonly infect
humans are found in groups A, B, D, and G. Within group B, strains can be divided
into at least 9 serotypes (Ia, Ib, II, III, IV, V, VI, VII, and VIII) based on the structure
of their polysaccharide capsule. Further categories based on, for example, the 
expression of certain proteins have also been developed. 

GBS strains of polysaccharide capsule Type V were rarely isolated before the
mid-1980s but now account for approximately one-third of clinical isolates in the US.
Type V is the most common capsular serotype associated with invasive infection in
nonpregnant adults, and the emergence of Type V strains over the past decade has been
temporally linked to an increase in GBS disease in this population.

Group A streptococcus is a frequent human pathogen, estimated to be present
in 5 - 15% of normal individuals without signs of disease. When host
defences are compromised, or when the organism is able to exert its virulence, or 
when it is introduced into vulnerable tissues or hosts, however, an acute infection 
occurs. Diseases include puerperal fever, scarlet fever, erysipelas, pharyngitis, 
impetigo, necrotising fasciitis, myositis and streptococcal toxic shock syndrome. 

Pneumococcus is the most common cause of acute respiratory infection and
otitis media and is estimated to result in over 3 million deaths in children every year
worldwide from pneumonia, bacteremia, or meningitis. Even more deaths occur
among elderly people, among whom S. pn. is the leading cause of
community-acquired pneumonia and meningitis. Since 1990, the proportion of
penicillin-resistant isolates has increased from 1 - 5% to 25 - 80%, and many strains
are now resistant to commonly prescribed antibiotics such as penicillin, macrolides,
and fluoroquinolones. See Tettelin, et al. (2001) Science 293, 498 - 506.

The complete genomic sequence of a virulent isolate of S. pneumoniae was
published by Tettelin, et al. (2001) Science 293, 498 - 506 and is available at the TIGR
website at http://www.tigr.org, as well as in GenBank (available through the PubMed
website at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi). The genomic
sequence, the Tettelin article and its published supplemental material are incorporated
herein by reference in their entirety.

The complete genomic sequence of an M1 strain of S. pyogenes was
published by Ferretti, et al. (2001) Proc. Natl. Acad. Sci. USA 98, 4658 - 4663 and is
available at the TIGR website at http://www.tigr.org. The genomic sequence, the
Ferretti article and its published supplemental materials are incorporated herein by 
reference in their entirety. 

The complete genomic sequence of a serotype V strain of S. agalactiae (type V
strain 2603 V/R) is published on the date of this filing, August 27, 2002, under GenBank
Accession No. AE009948 (available through PubMed at
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi) and/or is available on the same day at
the TIGR website at http://www.tigr.org. Most of this sequence is also available in
PCT International Patent Application Publication WO 02/34771. The genomic
sequence, the Tettelin article and its published supplemental materials are 
incorporated herein by reference in their entirety. 

Current treatments for Streptococcal infections include both antibiotics and 
prophylactic vaccination. Current vaccines, particularly with respect to GBS, suffer 
from poor immunogenicity, while the emergence of antibiotic resistant strains has 
lessened the effectiveness of currently used antibiotics. Accordingly, there is an 
increasing need for the development of new vaccines and antibiotics (as well as other 
small molecule bacterial inhibitors) to help prevent and treat Streptococcal infections. 

Applicants have identified regions of the Streptococcal genomes which can be 
used to identify and develop new vaccines and treatments for Streptococcal infections. 
Specifically, Applicants have identified polynucleotides of the Streptococcal genome 
which are conserved or specific to Streptococcal species, species serotypes, and/or 
specific serotype isolates. These polynucleotides and their expressed polypeptides 
can be used to screen, develop and design new vaccines, antibiotics and other small 
molecule bacterial inhibitors. These polynucleotides and their expressed polypeptides 
can further be used to diagnose and identify Streptococcal infections.

SUMMARY OF THE INVENTION

The invention relates to polynucleotides which are conserved or specific to
one or more species of Streptococcus, Streptococcus species serotypes, and/or
serotype isolates. In particular, the invention relates to polynucleotides from
Streptococcus which are conserved or specific to one or more of the species
S. pneumoniae ("pneumococcus" or "S. pn."), S. pyogenes ("group A streptococcus" or
"GAS"), and S. agalactiae ("group B streptococcus" or "GBS"). The invention
further relates to polynucleotides which are conserved or specific to one or more
Streptococcal species serotypes, such as GBS serotypes Ia, Ib, II, III, IV, V, VI, VII,
and VIII. The invention still further relates to polynucleotides which are conserved or
specific to one or more clinical isolates of a Streptococcus species. 

The invention is based on the identification of the following Subsets of genes.
Genes falling within each subset are described with respect to referenced tables, lists,
and/or figures (in particular the comparative genome hybridization (CGH) map depicted
in Figure 1).

The following Subsets relate to the GBS genome:

GBS Subset 1: 1060 GBS genes which have homologues with GAS and with
pneumococcus (Table 8);

GBS Subset 2: 225 GBS genes which have homologues with GAS, but not
with pneumococcus (Table 10);

GBS Subset 3: 176 GBS genes which have homologues with pneumococcus 
but not with GAS (Table 9); 

GBS Subset 4: 683 GBS genes which do not have homologues with GAS or 
pneumococcus (specific to GBS vs GAS and pneumococcus) (Table 11). 
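
The four-way split above amounts to simple set operations. The following is a minimal sketch only, assuming that homologue calls against GAS and pneumococcus have already been made for each gene (the method of calling homology is not shown here). The function name `partition_by_homology` and the gene identifiers are hypothetical and are used purely for illustration.

```python
def partition_by_homology(genes, has_gas_homologue, has_spn_homologue):
    """Split genes into four Subsets by presence of homologues.

    genes: iterable of gene identifiers (hypothetical here).
    has_gas_homologue / has_spn_homologue: sets of identifiers with a
    homologue in GAS / pneumococcus (assumed to be computed elsewhere).
    """
    genes = set(genes)
    gas = genes & set(has_gas_homologue)
    spn = genes & set(has_spn_homologue)
    return {
        "Subset 1": gas & spn,          # homologues in both GAS and pneumococcus
        "Subset 2": gas - spn,          # homologues in GAS only
        "Subset 3": spn - gas,          # homologues in pneumococcus only
        "Subset 4": genes - gas - spn,  # homologues in neither
    }


# Toy identifiers, not real ORF assignments.
toy = partition_by_homology(
    genes=["g1", "g2", "g3", "g4", "g5"],
    has_gas_homologue={"g1", "g2"},
    has_spn_homologue={"g1", "g3"},
)
```

Because the four Subsets are disjoint and together cover every classified gene, their sizes sum to the number of genes classified (1060 + 225 + 176 + 683 = 2144 for the GBS figures above), and any one Subset can be obtained by subtracting the other three from the full list.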

The invention is based on the identification of the following subsets of genes
within the GAS genome:

GAS Subset 1: 1006 GAS genes which have homologues with GBS and with
pneumococcus (Table 33);

GAS Subset 2: 212 GAS genes which have homologues with GBS but do not
have homologues with pneumococcus (Table 34);

GAS Subset 3: 62 GAS genes which have homologues with pneumococcus 
but do not have homologues with GBS (Table 35); 

GAS Subset 4: 416 GAS genes which do not have homologues with either 
GBS or pneumococcus. This Subset can be determined by subtracting the above 
subsets from the published genome. 

The invention is based on the identification of the following subsets of genes 
within the pneumococcus genome: 

Spn Subset 1: 1034 Spn genes which have homologues with GBS and GAS 
(Table 36); 

Spn Subset 2: 195 Spn genes which have homologues with GBS but do not 
have homologues with GAS (Table 37); 

Spn Subset 3: 74 Spn genes which have homologues with GAS but do not 
have homologues with GBS (Table 38); 

Spn Subset 4: 836 Spn genes which do not have homologues with either GBS
or GAS. This Subset can be determined by subtracting the above Subsets
from the published genome.

The invention further provides polynucleotides which are conserved or 
specific to Streptococcus based on a comparison with a wide range of published 
bacterial genomes. The following additional Subsets are provided: 

GBS Subset 1(a): Of the 1060 GBS genes which have homologues in both
GAS and pneumococcus, 12 of those GBS genes do not have homologues with any of
the other published bacterial genomes at the time of the invention (i.e., GBS Subset
1(a) is specific to Streptococcus vs non-Streptococcus published genomes). The 12
GBS ORFs are listed in Table 3.

GBS Subset 2(a): This Subset comprises GBS genes which have homologues 
with GAS, but not with pneumococcus or any other published bacterial genomes at 
the time of the invention.

GBS Subset 3(a): This Subset comprises GBS genes which have homologues
with pneumococcus, but not with GAS or any other published bacterial genomes at 
the time of the invention. 

GBS Subset 4(a): Of the 683 GBS genes which do not have homologues in
either GAS or pneumococcus, 315 of these GBS genes also do not have homologues
with any of the other published bacterial genomes. These include six proteins
predicted to be anchored on the cell wall (SAG0677, SAG0771, SAG1052, SAG1331,
SAG1473, and SAG1168), three of the capsule-related genes (SAG1163, SAG1167,
and SAG1168), six transcriptional regulators, and four genes of the cyl operon
(SAG0663 - SAG0673) essential for GBS hemolytic activity and production of
pigment. See Pritzlaff et al. (2001) Mol. Microbiol. 39, 236 - 247. The rest of the
315 proteins include 240 hypothetical proteins with no similarity to other proteins in
databases.

Many of the 315 genes specific to S. agalactiae are located in regions likely to 
constitute mobile genetic elements. Two of these regions resemble prophages 
(SAG0545-SAG0610 and SAG1835-SAG1885) displaying a mosaic structure with 
segments most similar to different bacteriophages, a pattern that suggests frequent 
recombination events. PblA and PblB are adhesins from a S. mitis prophage where
they contribute to endocarditis by binding to human platelets (see Bensing, et al.
(2001) Infect. Immun. 69, 6186 - 6192; Bensing, et al. (2001) Infect. Immun. 69, 1373
- 1380). Their orthologs in S. agalactiae are located on separate prophages and
display a different protein structure. Another region (SAG1247-SAG1299) encodes a 
putative conjugative transposon that carries genes for cadmium efflux and mercury 
resistance. 

GAS Subset 1(a): This Subset comprises GAS genes which have homologues 
with GBS and with pneumococcus, but do not have homologues with any of the other 
published bacterial genomes at the time of the invention.

GAS Subset 2(a): This Subset comprises GAS genes which have homologues
with GBS but do not have homologues with pneumococcus or any of the other
published bacterial genomes at the time of the invention.

GAS Subset 3(a): This Subset comprises GAS genes which have homologues 
with pneumococcus but do not have homologues with GBS or any of the other 
published bacterial genomes at the time of the invention. 

GAS Subset 4(a): This Subset comprises GAS genes which do not have 
homologues with either GBS or pneumococcus or with any of the other published 
bacterial genomes at the time of the invention. 

Spn Subset 1(a): This Subset comprises Spn genes which have homologues 
with GBS and GAS but which do not have homologues with any of the other 
published bacterial genomes at the time of the invention; 

Spn Subset 2(a): This Subset comprises Spn genes which have homologues 
with GBS but do not have homologues with GAS or with any of the other published 
bacterial genomes at the time of the invention; 

Spn Subset 3(a): This Subset comprises Spn genes which have homologues 
with GAS but do not have homologues with GBS or with any of the other published 
bacterial genomes at the time of the invention; 

Spn Subset 4(a): This Subset comprises Spn genes which do not have
homologues with either GBS or GAS or with any of the other published
bacterial genomes at the time of the invention.

The invention also provides polynucleotides which are conserved or specific 
to GBS serotypes and/or clinical isolates. Applicants have sequenced 19 GBS genes 
from a variety of GBS serotypes in 11 different clinical isolates. The sequences of 
these genes are set forth in Tables 13 - 31. The following additional subsets are
provided: 

GBS Subset 1(b): Of the 1060 GBS genes which have homologues with GAS
and with pneumococcus, 47 vary among the 11 clinical isolates
(GBS Subset 1(b)(i)) and 1013 are conserved across the 11 clinical
isolates (GBS Subset 1(b)(ii)). These lists can be determined by comparing the genes
listed in Table 8 with the Comparative Genome Hybridization in Figure 1.

GBS Subset 2(b): Of the 225 GBS genes which have homologues with GAS,
but not pneumococcus, 44 vary among the 11 clinical isolates
(GBS Subset 2(b)(i)) and 181 are conserved across the 11 clinical
isolates (GBS Subset 2(b)(ii)). These lists can be determined by comparing the genes
listed in Table 10 with the Comparative Genome Hybridization in Figure 1.

GBS Subset 3(b): Of the 176 GBS genes which have homologues with
pneumococcus but not with GAS, 44 vary among the 11 clinical isolates (GBS Subset
3(b)(i)) and 132 are conserved across the 11 clinical isolates (GBS
Subset 3(b)(ii)). These lists can be determined by comparing the genes listed in Table 9
with the Comparative Genome Hybridization in Figure 1.

GBS Subset 4(b): Of the 683 GBS genes which do not have homologues with
GAS or pneumococcus, 260 vary among the 11 clinical isolates (GBS
Subset 4(b)(i)) and 423 are conserved across the 11 clinical isolates
(GBS Subset 4(b)(ii)). These lists can be determined by comparing the genes listed in
Table 11 with the Comparative Genome Hybridization in Figure 1. GBS Subset
4(b)(ii) also includes the GBS ORFs listed in Table 12 that are marked under the
column "GBS specific".
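
The division of each Subset into variable (i) and conserved (ii) genes can be sketched as follows. This is an illustration only: it assumes that the hybridization results have already been reduced to a present/absent call for each gene in each isolate, which is a simplification. The name `split_by_conservation`, the data layout, and the gene and isolate labels are hypothetical.

```python
def split_by_conservation(subset_genes, presence):
    """Split genes into (variable, conserved) across isolates.

    presence maps gene -> {isolate: bool}. A gene counts as conserved
    only if it is called present in every isolate (an assumed criterion).
    """
    variable, conserved = set(), set()
    for gene in subset_genes:
        calls = presence[gene].values()
        (conserved if all(calls) else variable).add(gene)
    return variable, conserved


# Toy calls for three hypothetical genes in three hypothetical isolates.
presence = {
    "geneA": {"iso1": True, "iso2": True, "iso3": True},
    "geneB": {"iso1": True, "iso2": False, "iso3": True},
    "geneC": {"iso1": True, "iso2": True, "iso3": True},
}
variable, conserved = split_by_conservation(["geneA", "geneB", "geneC"], presence)
```

By construction the two resulting lists partition the input Subset, consistent with the counts given above (for example, 47 + 1013 = 1060 for GBS Subset 1(b)).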

The invention further provides polynucleotides which are likely recent 
genomic duplications in GBS. These duplications include glycosyl transferases, 
sortases, proteins anchored on the cell wall, β-lactam resistance factors, and many
hypothetical proteins. The GBS genes are listed in Table 4 (GBS Subset 5).

The invention is also based on the identification of a cluster of 13 adjacent
genes (SAG1410 - SAG1424) which is believed to encode enzymes required for
synthesis of the group B carbohydrate, a complex multiantennary structure of rhamnose,
glucitol phosphate, N-acetylglucosamine, and galactose (GBS Subset 6). Predicted
proteins encoded within this cluster include seven putative glycosyltransferases, four
of which are similar to rhamnosyltransferases in other streptococcal species; a
putative dTDP-L-rhamnose synthase; and proteins involved in glucitol synthesis. All
nine recognized GBS capsular polysaccharide types contain sialic acid residues as part
of their repeating unit structure, a feature that contributes to virulence by inhibiting
activation of the alternative complement pathway. See Edwards et al. (1982) J.
Immunol. 128, 1278 - 1283.

The type V capsular polysaccharide gene cluster consists of 18 genes (GBS
Subset 6(a)). A region of glycosyltransferases and related proteins (SAG1162 -
SAG1170) that direct the synthesis of the type V polysaccharide repeat unit is flanked
on either side by genes that are conserved in all known GBS capsule serotypes.
Downstream of this region are genes that encode enzymes for the biosynthesis and
activation of sialic acid (SAG1158 - SAG1161). Upstream of the serotype specific
region are genes (SAG1171 - SAG1175) found not only in all nine GBS capsular
serotypes but also in a variety of other polysaccharide-producing streptococci. 

The invention is also based on the identification of GBS ORFs predicted to 
encode proteins carrying a signal peptide (GBS Subset 7). These GBS ORFs are 
listed in Table 2 and marked under the column "signal peptide".

The invention is also based on the identification of GBS ORFs predicted to 
encode proteins which are anchored on the cell wall through an LPxTG motif (GBS 
Subset 8). These GBS ORFs are listed in Table 2 and marked under the column
"sortase motif".

The invention is also based on the identification of GBS ORFs predicted to
encode lipoproteins (GBS Subset 9). These GBS ORFs are listed in Table 2 and
marked under the column "lipoprotein".

The invention is also based on the identification of two GBS ORFs predicted
to encode enzymes related to metabolism (GBS Subset 10). These GBS ORFs
include a putative pullulanase (SAG1216) and a neuraminidase-related protein
(SAG1932).

The invention is also based on the identification of GBS ORFs predicted to
encode proteins exposed on the cell surface (GBS Subset 11). These GBS ORFs are
listed in Table 2 receiving a "+" under the column "FACS".

The invention is also based on the identification of 401 GBS ORFs from GBS
strain 2603 V/R which were not detected in at least one other of the 11 tested clinical
isolates (GBS Subset 12). See the Comparative Genome Hybridization in Figure 1. Of
these 401 ORFs, 364 correspond to 15 regions containing more than 5 contiguous
genes. Each region is identified in Figure 1 by numbered yellow bullets. Each region
comprises a subset as defined below: 

Region 1: GBS Subset 12(a). This region is unique to GBS (SAG0218 - 
SAG0238). This region is a possible plasmid or remnant of a phage and contains 
mostly hypothetical proteins. 

Region 2: GBS Subset 12(b) 

Region 3: GBS Subset 12(c) 

Region 4: GBS Subset 12(d) 

Region 5: GBS Subset 12(e)

Region 6: GBS Subset 12(f)

Region 7: GBS Subset 12(g) 

Region 8: GBS Subset 12(h). This region is specific to GBS (SAG1018 - 
SAG1037). This region comprises 20 proteins of unknown function, most of which
are predicted to be membrane associated or secreted, and displays an atypical 
nucleotide composition. 

Region 9: GBS Subset 12(i) 

Region 10: GBS Subset 12(j)

Region 11: GBS Subset 12(k)

Region 12: GBS Subset 12(l)

Region 13: GBS Subset 12(m)

PATENT APPLICATION 
ATTYREFNO. 19195.002 

Region 14: GBS Subset 12(n). This region is unique to GBS and spans 33
genes (SAG1989 - SAG2021), including 25 proteins of unknown function, some of which
carry a cell-wall anchor.

Region 15: GBS Subset 12(o). 
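
Grouping undetected ORFs into regions of more than 5 contiguous genes can be sketched as a run-finding pass over ORF numbers. This is a minimal sketch only, assuming that each ORF is represented by the integer part of its SAG number and that consecutive numbers are adjacent on the chromosome. The function name `contiguous_regions` and the example numbers are hypothetical and do not reproduce the regions of Figure 1.

```python
def contiguous_regions(orf_numbers, min_genes=6):
    """Return (first, last) pairs for runs of consecutive ORF numbers.

    min_genes=6 encodes "more than 5 contiguous genes".
    """
    regions = []
    run = []
    for n in sorted(set(orf_numbers)):
        if run and n == run[-1] + 1:
            run.append(n)          # extend the current run
        else:
            if len(run) >= min_genes:
                regions.append((run[0], run[-1]))
            run = [n]              # start a new run
    if len(run) >= min_genes:      # flush the final run
        regions.append((run[0], run[-1]))
    return regions


# Hypothetical undetected ORF numbers: one run of 7, one run of 3, one singleton.
absent = [100, 101, 102, 103, 104, 105, 106, 250, 251, 252, 400]
regions = contiguous_regions(absent)
```

In this toy input only the run of seven ORFs qualifies as a region; the shorter run and the singleton correspond to undetected ORFs that fall outside any region.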

This invention is also based on identification of clusters of GBS genes as set
forth in Figure 5 and Table 6. In Figure 5, the presence of a particular gene or gene
cluster is indicated by a red square and the absence of a gene or cluster by a
black square. The relationship between strains based on this analysis is depicted by the tree at
the top of the figure. The strains and their serotypes are indicated (NT: nontypeable).
Clusters with identical profiles are reduced to a single horizontal line and the number of genes
in each cluster is indicated on the right. The clusters of 5 or more genes, labeled in red text
and numbered, are listed in Table 6. The 1698 genes shared by all 19 strains are labeled in
green text. Applicants identified the following subsets:

GBS Subset 13 (a): Cluster 1 (from Table 6). 

GBS Subset 13 (b): Cluster 2 (from Table 6). 

GBS Subset 13 (c): Cluster 3 (from Table 6). 

GBS Subset 13 (d): Cluster 4 (from Table 6). 

GBS Subset 13 (e): Cluster 5 (from Table 6). 

GBS Subset 13 (f): Cluster 6 (from Table 6). 

GBS Subset 13 (g): Cluster 7 (from Table 6). 

GBS Subset 13 (h): Cluster 8 (from Table 6). 

GBS Subset 13 (i): Cluster 9 (from Table 6). 

GBS Subset 13 (j): Cluster 10 (from Table 6). 

GBS Subset 13 (k): Cluster 11 (from Table 6).

GBS Subset 13 (l): Cluster 12 (from Table 6).

GBS Subset 13 (m): Cluster 13 (from Table 6).

GBS Subset 13 (n): Cluster 14 (from Table 6).

GBS Subset 13 (o): Cluster 15 (from Table 6).

GBS Subset 13 (p): Cluster 16 (from Table 6). 

GBS Subset 13 (q): 1698 ORFs shared by all strains. 
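
The reduction of genes with identical presence/absence profiles to a single cluster can be sketched as a grouping step. This is a hedged illustration only: it assumes a present/absent call per gene per strain, it does not build the tree shown in Figure 5, and the function name `cluster_by_profile` and all gene and strain labels are hypothetical.

```python
from collections import defaultdict


def cluster_by_profile(presence, strains):
    """Group genes whose presence/absence pattern across strains is identical.

    presence maps gene -> set of strains in which the gene was detected.
    Returns {profile tuple: sorted list of genes}, where each profile is a
    tuple of booleans in the order given by strains.
    """
    clusters = defaultdict(list)
    for gene, found_in in presence.items():
        profile = tuple(s in found_in for s in strains)
        clusters[profile].append(gene)
    return {p: sorted(g) for p, g in clusters.items()}


strains = ["s1", "s2", "s3"]
presence = {
    "geneA": {"s1", "s2", "s3"},
    "geneB": {"s1", "s3"},
    "geneC": {"s1", "s2", "s3"},
    "geneD": {"s1", "s3"},
    "geneE": {"s1"},
}
clusters = cluster_by_profile(presence, strains)
core = clusters[(True, True, True)]  # genes shared by all strains
```

The all-present profile plays the role of the set of ORFs shared by all strains, while each remaining profile corresponds to one horizontal line of the kind described for Figure 5.
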

The invention is also based on the identification of the polynucleotide
sequences of 19 genes from 11 different GBS strains. The 19 genes are listed in
Table 7. A further GBS Subset 14 includes this set of polynucleotide sequences from
the 11 strains and their encoded polypeptide sequences. In particular, GBS Subset 14
contains a Subset of polynucleotide fragments of 10 or more contiguous
nucleotides which are conserved between two or more strains (GBS Subset
14(a)). GBS Subset 14 further includes a Subset of polynucleotide fragments of 15 or
more contiguous nucleotides which are conserved between two or more strains
(GBS Subset 14(b)). GBS Subset 14 further includes a Subset of polynucleotide
fragments of 10 or more contiguous nucleotides which are conserved between
three or more strains (GBS Subset 14(c)). GBS Subset 14 further includes a Subset
of polynucleotide fragments of 10 or more contiguous nucleotides which are
conserved between four or more strains (GBS Subset 14(d)).

GBS Subset 14 further includes a Subset of polypeptide fragments of 5 or
more contiguous amino acids which are conserved between two or more strains
(GBS Subset 14(e)). GBS Subset 14 further includes a Subset of polypeptide
fragments of 5 or more contiguous amino acids which are conserved between three or
more strains (GBS Subset 14(f)). GBS Subset 14 further includes a Subset of
polypeptide fragments of 5 or more contiguous amino acids which are conserved
between four or more strains (GBS Subset 14(g)). GBS Subset 14 further includes a
Subset of polypeptide fragments of 10 or more contiguous amino acids which are
conserved across two or more strains (GBS Subset 14(h)).
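
Identifying conserved fragments of a minimum length from aligned sequences can be sketched as a scan for runs of identical columns. This is a minimal sketch under stated assumptions: the sequences are already aligned to equal length with "-" marking gaps, conservation is taken to mean exact identity in every supplied sequence, and the function name `conserved_fragments` and the toy sequences are hypothetical (they are not taken from Tables 13 - 31).

```python
def conserved_fragments(aligned, min_len=10):
    """Return (start, fragment) for runs of identical, gap-free columns.

    aligned: equal-length aligned sequences, with '-' marking gaps.
    A run is reported only if it spans at least min_len columns.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    fragments = []
    start = None
    length = len(aligned[0])
    # Iterate one position past the end so that a trailing run is flushed.
    for i in range(length + 1):
        same = (
            i < length
            and aligned[0][i] != "-"
            and all(s[i] == aligned[0][i] for s in aligned)
        )
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                fragments.append((start, aligned[0][start:i]))
            start = None
    return fragments


# Toy aligned sequences differing at a single position.
strain_a = "ATGGCTAAAGGTCCATTTGA"
strain_b = "ATGGCTAAAGGTACATTTGA"
hits = conserved_fragments([strain_a, strain_b], min_len=10)
```

The same function could be applied to aligned polypeptide sequences with `min_len=5`. To find fragments conserved in two, three, or four or more strains out of a larger set, it could be applied to each combination of strains of the required size.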

The invention provides for methods of screening a Streptococcal genome for a 
conserved or a specific genomic sequence using one or more of the Subsets of the 
invention. 

The invention further provides for an immunogenic composition comprising a 
polypeptide expressed by one or more of the polynucleotides in one or more of the 
Subsets of the invention, and methods for designing an immunogenic composition by
selecting one or more polypeptides expressed by one or more of the polynucleotides
in one or more of the Subsets of the invention.

The invention further provides for methods of screening compounds for 
activity against Streptococcal bacteria, which method comprises contacting the
compounds with a polypeptide expressed by the polynucleotide from one of the 
Subsets of the invention. 

The invention further provides for compositions comprising one or more of 
the polynucleotides, and fragments thereof, selected from the group consisting of the 
sequences set forth in Tables 13-31. 

The invention further provides for compositions comprising polypeptides and 
fragments thereof encoded by the polynucleotides set forth in Tables 13-31. 

BRIEF DESCRIPTION OF THE TABLES AND DRAWINGS

Table 1 comprises a complete list of GBS predicted genes, listed by SAGxxxx
ORF number. The SAGxxxx ORF number corresponds to the genomic sequence for
the Streptococcus agalactiae type V strain 2603 V/R available either at the TIGR
website on the date of this filing at http://www.tigr.org or at the GenBank database at 
accession number AE009948. This table also includes the predicted amino acid size 
of the predicted expressed protein and the predicted function, if known. 

Table 2 comprises a list of predicted and experimentally characterized surface 
and secreted proteins from GBS. The SAGxxxx ORF number corresponds to the 
genomic sequence for the Streptococcus agalactiae type V strain 2603 V/R available 
either at the TIGR website on the date of this filing at http://www.tigr.org or at the
GenBank database at accession number AE009948. 

Table 3 lists GBS genes which were shared among GBS, GAS and 
pneumococcus, but which were not found in any of the other completely sequenced 
genomes. The SAGxxxx ORF number corresponds to the genomic sequence for the
Streptococcus agalactiae type V strain 2603 V/R available either at the TIGR website
on the date of this filing at http://www.tigr.org or at the GenBank database at
accession number AE009948.

Table 4 depicts GBS genes which are predicted to have been recently
duplicated within the genome. The SAGxxxx ORF number corresponds to the
genomic sequence for the Streptococcus agalactiae type V strain 2603 V/R available
either at the TIGR website on the date of this filing at http://www.tigr.org or at the
GenBank database at accession number AE009948. 

Table 5 lists the 19 GBS strains used for comparative genome hybridisations 
and phylogenetic analysis. 

Table 6 lists clusters of GBS genes derived from phylogenetic profiling of 
GBS strains based on comparative genome hybridisations. The SAGxxxx ORF
number corresponds to the genomic sequence for the Streptococcus agalactiae type V 
strain 2603 V/R available either at the TIGR website on the date of this filing at 
http://www.tigr.org or at the GenBank database at accession number AE009948. 

Table 7 lists the GBS genes used for phylogenetic analyses of the 19 GBS 
strains. The SAGxxxx ORF number corresponds to the genomic sequence for the 
Streptococcus agalactiae type V strain 2603 V/R available either at the TIGR website 
on the date of this filing at http://www.tigr.org or at the GenBank database at 
accession number AE009948. 

Table 8 lists the 1060 GBS ORFs which are shared with GAS and
pneumococcus. The ORFxxxxx reference number can be translated to SAGxxxx 
ORF number by using Table 32. The SAGxxxx ORF number corresponds to the 
genomic sequence for the Streptococcus agalactiae type V strain 2603 V/R available 
either at the TIGR website on the date of this filing at http://www.tigr.org or at the 
GenBank database at accession number AE009948. 

Table 9 lists the 176 GBS ORFs which are shared with pneumococcus but
which are not homologous to a GAS gene. The ORFxxxxx reference number can be
translated to SAGxxxx ORF number by using Table 32. The SAGxxxx ORF number
corresponds to the genomic sequence for the Streptococcus agalactiae type V strain
2603 V/R available either at the TIGR website on the date of this filing at
http://www.tigr.org or at the GenBank database at accession number AE009948.

Table 10 lists the 225 GBS ORFs which are shared with GAS but which are
not homologous with a pneumococcus gene. The ORFxxxxx reference number can
be translated to SAGxxxx ORF number by using Table 32. The SAGxxxx ORF
number corresponds to the genomic sequence for the Streptococcus agalactiae type V
strain 2603 V/R available either at the TIGR website on the date of this filing at
http://www.tigr.org or at the GenBank database at accession number AE009948.

Table 11 lists 683 GBS ORFs which are not shared with either GAS or 
pneumococcus. The ORFxxxxx reference number can be translated to SAGxxxx 
ORF number by using Table 32. The SAGxxxx ORF number corresponds to the 
genomic sequence for the Streptococcus agalactiae type V strain 2603 V/R available 
either at the TIGR website on the date of this filing at http://www.tigr.org or at the
GenBank database at accession number AE009948. 

Table 12 lists 315 GBS ORFs which are not shared with GAS, pneumococcus 
or any other published genomic sequence. The ORFxxxxx reference number can be 
translated to SAGxxxx ORF number by using Table 32. The SAGxxxx ORF number 
corresponds to the genomic sequence for the Streptococcus agalactiae type V strain 
2603 V/R available either at the TIGR website on the date of this filing at 
http://www.tigr.org or at the GenBank database at accession number AE009948.

Table 13 lists the polynucleotide sequences of the 11 strains relating to GBS
ORF SAG0466. An alignment of each of the sequences is also included.

Table 14 lists the polynucleotide sequences of the 11 strains relating to GBS
ORF SAG0471. An alignment of each of the sequences is also included.

Table 15 lists the polynucleotide sequences of the 11 strains relating to GBS
ORF SAG0492. An alignment of each of the sequences is also included.

Table 16 lists the polynucleotide sequences of the 11 strains relating to GBS
ORF SAG0767. An alignment of each of the sequences is also included.

Table 17 lists the polynucleotide sequences of the 11 strains relating to GBS
ORF SAG1086. An alignment of each of the sequences is also included.

Table 18 lists the polynucleotide sequences of the 11 strains relating to GBS
ORF SAG1600. An alignment of each of the sequences is also included.

Table 19 lists the polynucleotide sequences of the 11 strains relating to GBS
ORF SAG1680. An alignment of each of the sequences is also included.

Table 20 lists the polynucleotide sequences of the 11 strains relating to GBS
ORF SAG1723. An alignment of each of the sequences is also included.

Table 21 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG0079. An alignment of each of the sequences is also
included.

Table 22 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG0093. An alignment of each of the sequences is also
included.

Table 23 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG0163. An alignment of each of the sequences is also
included.

Table 24 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG0290. An alignment of each of the sequences is also
included.

Table 25 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG0368. An alignment of each of the sequences is also
included.

Table 26 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG0503. An alignment of each of the sequences is also
included.

Table 27 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG1473. An alignment of each of the sequences is also
included.

Table 28 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG1552. An alignment of each of the sequences is also
included.

Table 29 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG1641. An alignment of each of the sequences is also
included.

Table 30 lists the polynucleotide and polypeptide sequences of the 11 strains
relating to GBS ORF SAG2147. An alignment of each of the sequences is also
included.

Table 31 lists the polynucleotide and polypeptide sequences of the 11 strains 
relating to GBS ORF SAG2148. An alignment of each of the sequences is also 
included. 

Table 32 provides a conversion table for the ORFxxxx reference numbers to 
the SAGxxxx reference numbers. The SAGxxxx ORF number corresponds to the
genomic sequence for the Streptococcus agalactiae type V strain 2603 V/R available 
either at the TIGR website on the date of this filing at http://www.tigr.org or at the 
GenBank database at accession number AE009948. 

Table 33 lists the 1006 GAS ORFs which are shared with GBS and Spn. The 
sequences corresponding to these ORFs were published in GenBank, Accession No. 
AAK33146 (protein sequence). A link to the corresponding polynucleotide sequence 
is also available. The numbers for the GAS ORF refer directly to their GenBank 
entries. 

Table 34 lists the 212 GAS ORFs which are shared with GBS but which do
not have homologues with pneumococcus. The sequences corresponding to these 
ORFs were published in GenBank, Accession No. AAK33146 (protein sequence). A 
link to the corresponding polynucleotide sequence is also available. The numbers for 
the GAS ORF refer directly to their GenBank entries. 

Table 35 lists the 62 GAS ORFs which have homologues with pneumococcus
but which do not have homologues with GBS. The sequences corresponding to these
ORFs were published in GenBank, Accession No. AAK33146 (protein sequence). A
link to the corresponding polynucleotide sequence is also available. The numbers for 
the GAS ORF refer directly to their GenBank entries. 

Table 36 lists the 1034 Spn ORFs which are shared with GBS and GAS. 
These ORFs were published in GenBank. The numbers for Spn correspond to the 
entry for AE005672. 

Table 37 lists the 195 Spn ORFs which are shared with GBS but do not have 
homologues with GAS. These ORFs were published in GenBank. The numbers for 
Spn correspond to the entry for AE005672. 

Table 38 lists the 74 Spn ORFs which are shared with GAS but do not have 
homologues with GBS. These ORFs were published in GenBank. The numbers for 
Spn correspond to the entry for AE005672. 

Figure 1 is a circular representation of the GBS genome and comparative 
hybridisations using microarrays. 

Figure 2 is a schematic representation of in silico comparisons between 
streptococci. 

Figure 3 depicts a phylogenetic tree of GBS strains based on PCR sequences.

Figure 4 depicts a linear representation of the GBS genome.

Figure 5 demonstrates phylogenetic profiling of GBS strains based on
comparative genome hybridisations. 

BRIEF DESCRIPTION OF THE SEQUENCE ID NOS. 
The following SEQ ID NOS are used in the application and figures. 
SEQ ID NOS. 1301 - 1316 represent the polynucleotide sequences 
corresponding to the SAG0466 ORF (thiolase) in the GBS strains indicated for each 
sequence, including where indicated reverse complements. 

SEQ ID NOS. 1401 - 1417 represent the polynucleotide sequences 
corresponding to the SAG0471 ORF (glucokinase) in the GBS strains indicated for 
each sequence, including where indicated reverse complements.

SEQ ID NOS. 1501 - 1511 represent the polynucleotide sequences
corresponding to the SAG0492 ORF (amino acid ABC transporter, ATP-binding 
protein) in the GBS strains indicated for each sequence, including where indicated 
reverse complements. 

SEQ ID NOS. 1601 - 1617 represent the polynucleotide sequences
corresponding to the SAG0767 ORF (D-alanine - D-alanine ligase) in the GBS strains 
indicated for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 1701 - 1711 represent the polynucleotide sequences 
corresponding to the SAG1086 ORF (xanthine phosphoribosyltransferase) in the GBS 
strains indicated for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 1801 - 1814 represent the polynucleotide sequences 
corresponding to the SAG1600 ORF (glutamate racemase) in the GBS strains 
indicated for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 1901 - 1914 represent the polynucleotide sequences 
corresponding to the SAG1680 ORF (shikimate 5-dehydrogenase) in the GBS strains 
indicated for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 2001 - 2010 represent the polynucleotide sequences 
corresponding to the SAG1723 ORF (signal peptidase I) in the GBS strains indicated 
for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 2101 - 2112 represent the polynucleotide sequences 
corresponding to the SAG0079 ORF (adenylate kinase) in the GBS strains indicated 
for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 2201 - 2211 represent the polynucleotide sequences
corresponding to the SAG0093 ORF (D-alanyl-D-alanine carboxypeptidase family 
protein) in the GBS strains indicated for each sequence, including where indicated 
reverse complements. 

SEQ ID NOS. 2301 - 2311 represent the polynucleotide sequences 
corresponding to the SAG0163 ORF (competence protein CglA) in the GBS strains 
indicated for each sequence, including where indicated reverse complements.

SEQ ID NOS. 2401 - 2410 represent the polynucleotide sequences
corresponding to the SAG0290 ORF (ABC transporter, substrate-binding protein) in
the GBS strains indicated for each sequence, including where indicated reverse 
complements. 

SEQ ID NOS. 2501 - 2511 represent the polynucleotide sequences 
corresponding to the SAG0368 ORF (protein of unknown function) in the GBS strains 
indicated for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 2601 - 2609 represent the polynucleotide sequences 
corresponding to the SAG0503 ORF (lipase/acylhydrolase) in the GBS strains 
indicated for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 2701 - 2711 represent the polynucleotide sequences 
corresponding to the SAG1473 ORF (cell wall surface anchor family protein) in the 
GBS strains indicated for each sequence, including where indicated reverse 
complements. 

SEQ ID NOS. 2801 - 2811 represent the polynucleotide sequences
corresponding to the SAG1552 ORF (conserved hypothetical protein) in the GBS
strains indicated for each sequence, including where indicated reverse complements.

SEQ ID NOS. 2901 - 2911 represent the polynucleotide sequences
corresponding to the SAG1641 ORF (YaeC family protein) in the GBS strains 
indicated for each sequence, including where indicated reverse complements. 

SEQ ID NOS. 3001 - 3010 represent the polynucleotide sequences 
corresponding to the SAG2147 ORF (protein of unknown function / lipoprotein, 
putative) in the GBS strains indicated for each sequence, including where indicated 
reverse complements. 

SEQ ID NOS. 3101 - 3111 represent the polynucleotide sequences 
corresponding to the SAG2148 ORF (LysM domain protein) in the GBS strains 
indicated for each sequence, including where indicated reverse complements. 
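
Several of the SEQ ID NOS above are listed as reverse complements. As a minimal sketch of what that means for a DNA string, assuming only the unambiguous bases A, C, G, T plus N (the function name and the example sequence are hypothetical and are not taken from the sequence listing):

```python
# Complement table for unambiguous DNA bases plus N, in both cases.
# Assumption: other IUPAC ambiguity codes do not occur in the input.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical six-base example, not one of the listed sequences.
    print(reverse_complement("ATGAAC"))
```

Applying the function twice returns the original sequence, which is a quick check that a listed reverse complement and its forward strand describe the same ORF.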

PATENT APPLICATION 
ATTY REF NO. 19195.002

DETAILED DESCRIPTION OF THE INVENTION 

The invention relates to polynucleotides which are conserved or specific to 
one or more species of Streptococcus, Streptococcus species serotypes, and/or
serotype isolates. In particular, the invention relates to polynucleotides from 
Streptococcus which are conserved or specific to one or more of the species of S. 
pneumoniae ("pneumococcus" or "S. pn."), S. pyogenes ("group A streptococcus" or 
"GAS"), and S. agalactiae ("group B streptococcus" or "GBS"). The invention 
further relates to polynucleotides which are conserved or specific to one or more 
Streptococcal species serotypes, such as GBS serotypes Ia, Ib, II, III, IV, V, VI, VII,
and VIII. The invention still further relates to polynucleotides which are conserved or
specific to one or more clinical isolates of a Streptococcus species. 

In order to facilitate an understanding of the invention, selected terms used in 
the application will be discussed below. 

As used herein, the phrase "species of Streptococcus" generally refers to
species of the Streptococcus genus, including S. pneumoniae ("pneumococcus" or
"S. pn."), S. pyogenes ("group A streptococcus" or "GAS") and S. agalactiae ("group B
streptococcus" or "GBS").

As used herein, the phrase "Streptococcus species serotypes" generally refers
to subdivisions based on a distinguishing characteristic within a specific 
Streptococcus species. The distinguishing characteristic can be identified by any of a 
wide range of diagnostic tools. For instance, GBS is generally recognized as 
comprising at least nine subdividing serotypes based on the structure of their 
polysaccharide capsule. 

As used herein, the phrases "serotype isolates" or "clinical isolates" generally
refer to specific isolated bacterial strains of a specific Streptococcal species and 
serotype. 

As used herein in reference to bacterial genomes, the phrases "conserved" or
"shared" generally refer to genomic sequences which have homologues in the two or
more genomes in the reference. Homology references, as used in this application, are
based on comparisons using FASTA3. See Pearson (2000) Methods Mol Biol 132:
185-219. When the homology reference involves a comparison between genes in
GBS, GAS or Spn, homologous or shared genes are defined by using a FASTA3 P
value cutoff of 10^-15. Where the homology reference involves a comparison between
GBS, GAS or Spn and all other completely sequenced genomes, homologous or 
shared genes are defined by using a FASTA3 P value cutoff of 10^-5 or lower.
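
The cutoff-based definition of "shared" genes can be sketched as follows. This is a minimal illustration, not the procedure actually used. It assumes that the best FASTA3 P value for each query ORF against a target genome has already been extracted into a dictionary, reads the stated cutoff for comparisons among GBS, GAS and Spn as 10^-15, and treats a hit as shared when its P value is at or below the cutoff. The function name, ORF labels and P values are hypothetical. The cutoff for comparisons against other genomes is left as a parameter.

```python
from typing import Dict, Optional, Set

# Cutoff for comparisons among GBS, GAS and Spn, read from the text as 10^-15.
INTRA_STREP_CUTOFF = 1e-15


def shared_genes(best_hits: Dict[str, Optional[float]],
                 cutoff: float = INTRA_STREP_CUTOFF) -> Set[str]:
    """Return the query ORFs whose best hit meets the P value cutoff.

    best_hits maps a query ORF id to the smallest P value found against
    the target genome, or None when no hit was reported at all.
    Assumption: "meets the cutoff" means P <= cutoff.
    """
    return {orf for orf, p in best_hits.items()
            if p is not None and p <= cutoff}


if __name__ == "__main__":
    # Hypothetical ORF labels and P values, for illustration only.
    hits = {"ORF_A": 1e-40, "ORF_B": 1e-10, "ORF_C": None}
    print(sorted(shared_genes(hits)))
    print(sorted(shared_genes(hits, cutoff=1e-5)))
```

Keeping the cutoff as a parameter lets the same function serve both the comparison among the three streptococci and the looser comparison against other completely sequenced genomes.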

As used herein in reference to bacterial genomes, the phrases "specific to" or 
"not shared" generally refer to genomic sequences which do not have homologues in
the two or more genomes in the reference. 

Other software programs to compare identity between nucleotide sequences 
are known in the art, for example those described in section 7.7.18 of Current 
Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987) Supplement 30. A
preferred alignment program is GCG Gap (Genetics Computer Group, Wisconsin,
Suite Version 10.1), preferably using default parameters, which are as follows: open
gap = 3; extend gap = 1.

Sequences within a Subset of the invention include sequences which hybridize 
to the listed genes. Hybridization reactions can be performed under conditions of 
different "stringency". Conditions that increase stringency of a hybridization reaction 
are widely known and published in the art [e.g. page 7.52 of Sambrook et al (1989)
Molecular Cloning: A Laboratory Manual NY, Cold Spring Harbor Laboratory].
Examples of relevant conditions include (in order of increasing stringency): 
incubation temperatures of 25°C, 37°C, 50°C, 55°C and 68°C; buffer concentrations 
of 10 x SSC, 6 x SSC, 1 x SSC, 0.1 x SSC (where SSC is 0.15 M NaCl and 15 mM 
citrate buffer) and their equivalents using other buffer systems; formamide 
concentrations of 0%, 25%, 50%, and 75%; incubation times from 5 minutes to 24 
hours; 1, 2, or more washing steps; wash incubation times of 1, 2, or 15 minutes; and 
wash solutions of 6 x SSC, 1 x SSC, 0.1 x SSC, or de-ionized water. Hybridization 
techniques and their optimization are well known in the art [e.g. see Sambrook et al; 
RNA Methodologies (Farrell, 1998) (Academic Press; ISBN 0-12-249695-7); Current 
Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987) Supplement 30;
Short protocols in molecular biology (4th edition, 1999) Ausubel et al eds. ISBN
0-471-32938-X; US patent 5,707,829 etc.].

Identity between polypeptide sequences can be determined using software 
programs known in the art, for example those described in section 7.7.18 of Current 
Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987) Supplement 30. A
preferred alignment is determined by the Smith-Waterman homology search
algorithm [Smith & Waterman (1981) Adv. Appl. Math. 2: 482-489] using an affine
gap search with a gap open penalty of 12 and a gap extension penalty of 2, BLOSUM
matrix 62.
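
A local alignment with affine gap penalties can be sketched as below. This is an illustrative implementation under stated assumptions, not the search program referred to above. It computes only the best local score and uses a hypothetical identity scoring function (+5 for a match, -4 for a mismatch) in place of the BLOSUM62 matrix. It also assumes that the first residue of a gap costs the gap open penalty and each further residue costs the gap extension penalty. Other programs define these penalties slightly differently.

```python
def simple_score(x: str, y: str) -> int:
    """Hypothetical stand-in for a substitution matrix such as BLOSUM62."""
    return 5 if x == y else -4


def smith_waterman_score(a: str, b: str, score=simple_score,
                         gap_open: int = 12, gap_extend: int = 2) -> int:
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # h: best score of a local alignment ending at (i, j).
    # e: best score ending at (i, j) with a gap in a (b[j-1] unpaired).
    # f: best score ending at (i, j) with a gap in b (a[i-1] unpaired).
    h = [[0] * (m + 1) for _ in range(n + 1)]
    e = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    f = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            e[i][j] = max(e[i][j - 1] - gap_extend, h[i][j - 1] - gap_open)
            f[i][j] = max(f[i - 1][j] - gap_extend, h[i - 1][j] - gap_open)
            diag = h[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # The zero floor is what makes the alignment local.
            h[i][j] = max(0, diag, e[i][j], f[i][j])
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    # Hypothetical peptides differing by one deleted residue.
    print(smith_waterman_score("ACDEFGHIKLMNPQ", "ACDEFHIKLMNPQ"))
```

Passing a function that looks up BLOSUM62 values as the `score` argument would bring the sketch closer to the parameters named in the text.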

Typically, 50% identity or more between two proteins may be considered to 
be an indication of functional equivalence. A reference to a percentage sequence
identity between two amino acid sequences means that, when aligned, that percentage
of amino acids is the same in comparing the two sequences.
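
Percentage identity over an existing alignment can be sketched as below. The text does not fix a denominator, so this sketch assumes it is the total number of alignment columns, including columns where one sequence has a gap. The function name, the gap character "-" and the example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns holding the same residue.

    Both inputs must already be aligned and therefore of equal length.
    Assumption: gap columns count toward the denominator but never
    count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != gap)
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical aligned peptides: 4 identical columns out of 6.
    print(round(percent_identity("MKT-AY", "MKSLAY"), 2))
```

Under the 50% guideline stated above, a pair of proteins scoring at or above 50.0 here might be treated as candidates for functional equivalence.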

The terms "polypeptide", "protein" and "amino acid sequence" as used herein
generally refer to a polymer of amino acid residues and are not limited to a minimum
length of the product. Thus, peptides, oligopeptides, dimers, multimers, and the like,
are included within the definition. Both full-length proteins and fragments thereof are
encompassed by the definition. Minimum fragments of polypeptides useful in the
invention can be at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or even 15 amino acids.
Typically, polypeptides useful in this invention can have a maximum length suitable
for the intended application. Generally, the maximum length is not critical and can
easily be selected by one skilled in the art.

Reference to polypeptides and the like also includes derivatives of the amino 
acid sequences of the invention. Such derivatives can include postexpression 
modifications of the polypeptide, for example, glycosylation, acetylation, 
phosphorylation, and the like. Amino acid derivatives can also include modifications 
to the native sequence, such as deletions, additions and substitutions (generally 
conservative in nature), so long as the protein maintains the desired activity. These 
modifications may be deliberate, as through site-directed mutagenesis, or may be
accidental, such as through mutations of hosts which produce the proteins or errors 
due to PCR amplification. Furthermore, modifications may be made that have one or 
more of the following effects: reducing toxicity; facilitating cell processing (e.g., 
secretion, antigen presentation, etc.); and facilitating presentation to B-cells and/or
T-cells.

A "recombinant" protein is a protein which has been prepared by recombinant 
DNA techniques as described herein. In general, the gene of interest is cloned and 
then expressed in transformed organisms, as described further below. The host 
organism expresses the foreign gene to produce the protein under expression
conditions. The polypeptides of the invention may be prepared by recombinant 
means. 

The term "polynucleotide", as known in the art, generally refers to a nucleic
acid molecule. A "polynucleotide" can include both double- and single-stranded
sequences and refers to, but is not limited to, cDNA from viral, prokaryotic or
eukaryotic mRNA, genomic RNA and DNA sequences from viral (e.g. RNA and
DNA viruses and retroviruses) or prokaryotic DNA, and especially synthetic DNA
sequences. The term also captures sequences that include any of the known base
analogs of DNA and RNA, and includes modifications such as deletions, additions 
and substitutions (generally conservative in nature), to the native sequence, so long as 
the nucleic acid molecule encodes a therapeutic or antigenic protein. These 
modifications may be deliberate, as through site-directed mutagenesis, or may be 
accidental, such as through mutations of hosts that produce the antigens. 
Modifications of polynucleotides may have any number of effects including, for 
example, facilitating expression of the polypeptide product in a host cell. 
The term "polynucleotide" further includes DNA, RNA, DNA/RNA hybrids, DNA 
and RNA analogues such as those containing modified backbones (with modifications 
in the sugar and/or phosphates e.g. phosphorothioates, phosphoramidites etc.), and 
also peptide nucleic acids (PNA) and any other polymer comprising purine and 
pyrimidine bases or other natural, chemically or biochemically modified, non-natural,
or derivatized nucleotide bases etc. Nucleic acid according to the invention can be 
prepared in many ways (e.g. by chemical synthesis, from genomic or cDNA libraries, 
from the organism itself etc.) and can take various forms (e.g. single stranded, double 
stranded, vectors, probes etc.).

A polynucleotide can encode a biologically active (e.g., immunogenic or 
therapeutic) protein or polypeptide. Depending on the nature of the polypeptide 
encoded by the polynucleotide, a polynucleotide can include as little as 10 
nucleotides, e.g., where the polynucleotide encodes an antigen. 

By "isolated" is meant, when referring to a polynucleotide or a polypeptide,
that the indicated molecule is separate and discrete from the whole organism with
which the molecule is found in nature or, when the polynucleotide or polypeptide is
not found in nature, is sufficiently free of other biological macromolecules so that the
polynucleotide or polypeptide can be used for its intended purpose. 

"Antibody" as known in the art includes one or more biological moieties that, 
through chemical or physical means, can bind to or associate with an epitope of a 
polypeptide of interest. The antibodies of the invention specifically bind to infectious 
prion conformations. The term "antibody" includes antibodies obtained from both 
polyclonal and monoclonal preparations, as well as the following: hybrid (chimeric) 
antibody molecules (see, for example, Winter et al. (1991) Nature 349: 293-299; and 
U.S. Patent No. 4,816,567); F(ab')2 and F(ab) fragments; Fv molecules (non-covalent
heterodimers, see, for example, Inbar et al. (1972) Proc Natl Acad Sci USA 69:2659-
2662; and Ehrlich et al. (1980) Biochem 12:4091-4096); single-chain Fv molecules
(sFv) (see, for example, Huston et al. (1988) Proc Natl Acad Sci USA 85:5879-5883);
dimeric and trimeric antibody fragment constructs; minibodies (see, e.g., Pack et al. 
(1992) Biochem 31:1579-1584; Cumber et al. (1992) J Immunology 149B: 120-126); 
humanized antibody molecules (see, for example, Riechmann et al. (1988) Nature 
332:323-327; Verhoeyan et al. (1988) Science 232:1534-1536; and U.K. Patent 
Publication No. GB 2,276,169, published 21 September 1994); and, any functional 
fragments obtained from such molecules, wherein such fragments retain
immunological binding properties of the parent antibody molecule. The term 
"antibody" further includes antibodies obtained through non-conventional processes, 
such as phage display. 

As used herein, the term "monoclonal antibody" refers to an antibody 
composition having a homogeneous antibody population. The term is not limited 
regarding the species or source of the antibody, nor is it intended to be limited by the 
manner in which it is made. Thus, the term encompasses antibodies obtained from 
murine hybridomas, as well as human monoclonal antibodies obtained using human 
rather than murine hybridomas. See, e.g., Cote, et al. Monoclonal Antibodies and 
Cancer Therapy, Alan R. Liss, 1985, p 77. 

An " immunogenic composition " as used herein refers to a composition that 
comprises an antigenic molecule where administration of the composition to a subject 
results in the development in the subject of a humoral and/or a cellular immune 
response to the antigenic molecule of interest. The immunogenicity of the 
composition or the antigenicity of the molecule may be facilitated by the use of an 
adjuvant. 

The practice of the present invention will employ, unless otherwise indicated, 
conventional methods of chemistry, biochemistry, molecular biology, immunology 
and pharmacology, within the skill of the art. Such techniques are explained fully in
the literature. See, e.g., Remington's Pharmaceutical Sciences, 18th Edition (Easton, 
Pennsylvania: Mack Publishing Company, 1990); Methods In Enzymology (S. 
Colowick and N. Kaplan, eds., Academic Press, Inc.); and Handbook of Experimental 
Immunology, Vols. I-IV (D.M. Weir and C.C. Blackwell, eds., 1986, Blackwell 
Scientific Publications); Sambrook, et al., Molecular Cloning: A Laboratory Manual 
(2nd Edition, 1989); Handbook of Surface and Colloidal Chemistry (Birdi, K.S. ed., 
CRC Press, 1997); Short Protocols in Molecular Biology, 4th ed. (Ausubel et al. eds., 
1999, John Wiley & Sons); Molecular Biology Techniques: An Intensive Laboratory 
Course, (Ream et al., eds., 1998, Academic Press); PCR (Introduction to 
Biotechniques Series), 2nd ed. (Newton & Graham eds., 1997, Springer Verlag);
Peters and Dalrymple, Fields Virology (2d ed), Fields et al. (eds.), B.N. Raven Press, 
New York, NY. 

It is understood that the antibodies and methods of this invention are not 
limited to particular formulations or process parameters as such may, of course, vary. 
It is also to be understood that the terminology used herein is for the purpose of 
describing particular embodiments of the invention only, and is not intended to be 
limiting. 

All publications, patents and patent applications cited herein are hereby 
incorporated by reference in their entirety. 

Vaccines and Immunisation 

The invention provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is conserved across one or more species of Streptococcus. 

The polynucleotide is preferably conserved across one or more species of 
Streptococcus selected from the group consisting of GBS, GAS and pneumococcus. 
In one embodiment, the polynucleotide is a GBS polynucleotide which is homologous 
with at least one gene from both GAS and pneumococcus. Preferably, the GBS 
polynucleotide is selected from GBS Subset 1, which includes 1060 GBS genes which 
have homologues with both GAS and pneumococcus (Table 8).

In another embodiment, the polynucleotide is a GAS polynucleotide which is 
homologous with at least one gene from both GBS and pneumococcus. Preferably, 
the GAS polynucleotide is selected from GAS Subset 1, which includes 1006 GAS 
genes which have homologues with both GBS and pneumococcus. 

In another embodiment, the polynucleotide is a pneumococcal polynucleotide 
which is homologous with at least one gene from both GAS and GBS. Preferably, the
pneumococcus polynucleotide is selected from Spn Subset 1, which includes 1034
pneumococcal genes which have homologues with both GBS and GAS.

In another embodiment, the polynucleotide is a GBS polynucleotide which is
homologous with at least one gene from GAS. Preferably, the polynucleotide is
selected from one of the genes listed in GBS Subset 2, which includes 225 GBS genes
which have homologues with GAS, but not with pneumococcus. 

In another embodiment, the polynucleotide is a GBS polynucleotide which is 
homologous with at least one gene from pneumococcus. Preferably, the 
polynucleotide is selected from GBS Subset 3, which includes 176 GBS genes which 
have homologues with pneumococcus. 

In another embodiment, the polynucleotide is a GAS polynucleotide which is 
homologous with at least one gene from GBS. Preferably, the polynucleotide is 
selected from GAS Subset 2, which includes 212 GAS genes which have a 
homologue with GBS. 

In another embodiment, the polynucleotide is a GAS polynucleotide which is 
homologous with at least one gene from pneumococcus. Preferably, the polynucleotide
is selected from GAS Subset 3, which includes 62 GAS genes which have a 
homologue with pneumococcus. 

In another embodiment, the polynucleotide is a pneumococcus polynucleotide 
which is homologous with at least one gene from GBS. Preferably, the 
polynucleotide is selected from Spn Subset 2, which includes 195 Spn genes which 
have a homologue with GBS. 

In another embodiment, the polynucleotide is a pneumococcus polynucleotide 
which is homologous with at least one gene from GAS. Preferably, the 
polynucleotide is selected from Spn Subset 3, which includes 74 Spn genes which 
have a homologue with GAS. 

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is specific to one or more species of Streptococcus. 

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide which is 
specific to GBS, GAS and pneumococcus. In one embodiment, the polynucleotide is
a GBS polynucleotide which is homologous to at least one gene from both GAS and 
pneumococcus. Preferably, the GBS polynucleotide is selected from GBS Subset 1. 
In an alternative embodiment, the polynucleotide is a GBS polynucleotide which is 
homologous to at least one gene from both GAS and pneumococcus, but which is not 
homologous to a gene in any other published bacterial genome at the time of the 
invention. Preferably, the GBS polynucleotide is selected from one of the 12 GBS 
genes included in GBS Subset 1(a). (Table 3). 

In another embodiment, the polynucleotide is a GAS polynucleotide which is 
homologous to at least one gene in both GBS and pneumococcus. Preferably, the 
GAS polynucleotide is selected from GAS Subset 1. In another embodiment, the 
polynucleotide is a GAS polynucleotide which is homologous to at least one gene in 
both GBS and pneumococcus but which is not homologous to any gene in any other 
published bacterial genome at the time of the invention. Preferably, the GAS 
polynucleotide is selected from GAS Subset 1(a). 

Alternatively, the polynucleotide is a pneumococcus polynucleotide which is
homologous to at least one gene in both GBS and GAS. Preferably, the 
pneumococcus polynucleotide is selected from Spn Subset 1. In another
embodiment, the polynucleotide is a pneumococcus polynucleotide which is
homologous to at least one gene in both GBS and GAS but which does not have a 
homologue in any other published bacterial genome at the time of the invention. 
Preferably, the pneumococcus polynucleotide is selected from Spn Subset 1(a). 

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is specific to GBS. In one embodiment, the polynucleotide is a GBS 
polynucleotide which is not homologous to a gene in either GAS or pneumococcus.
Preferably, the GBS polynucleotide is selected from one of the 683 GBS genes 
included in GBS Subset 4. In a further embodiment, the polynucleotide is a GBS 
polynucleotide which is not homologous to a gene in either GAS or pneumococcus or 
any other published bacterial genome at the time of the invention. Preferably, the
GBS polynucleotide is selected from one of the 315 GBS genes in GBS Subset 4(a). 
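
The way the GBS genes fall into Subsets 1 to 4 can be sketched with set operations. This is a minimal illustration under assumptions: homologue calls against GAS and pneumococcus are taken as already made, the subset labels follow the descriptions above (1: homologues in both, 2: GAS only, 3: pneumococcus only, 4: neither), and the function name and gene labels are hypothetical.

```python
from typing import Dict, Iterable, Set


def partition_subsets(all_genes: Iterable[str],
                      with_gas_homologue: Iterable[str],
                      with_spn_homologue: Iterable[str]) -> Dict[str, Set[str]]:
    """Split one genome's genes by where they have homologues."""
    genes = set(all_genes)
    # Ignore any identifiers that are not genes of this genome.
    gas = set(with_gas_homologue) & genes
    spn = set(with_spn_homologue) & genes
    return {
        "Subset 1": gas & spn,          # homologues in both other species
        "Subset 2": gas - spn,          # homologue in GAS only
        "Subset 3": spn - gas,          # homologue in pneumococcus only
        "Subset 4": genes - gas - spn,  # no homologue in either
    }


if __name__ == "__main__":
    # Hypothetical gene labels, for illustration only.
    subsets = partition_subsets(
        all_genes=["g1", "g2", "g3", "g4", "g5"],
        with_gas_homologue=["g1", "g2"],
        with_spn_homologue=["g1", "g3"],
    )
    for name in sorted(subsets):
        print(name, sorted(subsets[name]))
```

The four subsets are disjoint and together cover every gene of the genome, so their sizes should add up to the total gene count. The same partition applies to GAS or pneumococcus genes when the roles of the three species are exchanged.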

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is specific to GAS. In one embodiment, the polynucleotide is a GAS 
polynucleotide which is not homologous to a gene in either GBS or pneumococcus. 
Preferably, the GAS polynucleotide is selected from one of the 416 GAS genes
included in GAS Subset 4. In a further embodiment, the polynucleotide is a GAS 
polynucleotide which does not have a homologue in either GBS or pneumococcus or 
in any other published bacterial genome at the time of the invention. Preferably, the 
GAS polynucleotide is selected from GAS Subset 4(a). 

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is specific to pneumococcus. In one embodiment, the polynucleotide is a 
pneumococcus polynucleotide which is not homologous to a gene in either GBS or 
GAS. Preferably, the pneumococcus polynucleotide is selected from one of the 836 
Spn genes included in Spn Subset 4. In a further embodiment, the polynucleotide is a 
pneumococcus polynucleotide which does not have a homologue in either GBS or 
GAS or in any other published bacterial genome at the time of the invention. 
Preferably, the pneumococcus polynucleotide is selected from Spn Subset 4(a). 

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is specific to GBS and GAS. In one embodiment, the polynucleotide is a GBS 
polynucleotide which is homologous to at least one gene from GAS but is not 
homologous to a gene from pneumococcus. Preferably, the GBS polynucleotide is 
selected from one of the 225 GBS genes included in GBS Subset 2. In another 
embodiment, the GBS polynucleotide is homologous to at least one gene from GAS 
but is not homologous to any gene from pneumococcus and does not have a 
homologue in any other published bacterial genome at the time of the invention.
Preferably, the GBS polynucleotide is selected from GBS Subset 2(a). 

In another embodiment, the polynucleotide is a GAS polynucleotide which is 
homologous to at least one gene from GBS but is not homologous to any gene from 
pneumococcus. Preferably, the GAS polynucleotide is selected from one of the 212 
GAS genes included in GAS Subset 2. In another embodiment, the GAS 
polynucleotide is homologous to at least one gene from GBS but is not homologous to 
any gene from pneumococcus and does not have a homologous gene in any other
published bacterial genome at the time of the invention. Preferably, the GAS
polynucleotide is selected from GAS Subset 2(a).

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is specific to GBS and pneumococcus. In one embodiment, the polynucleotide 
is a GBS polynucleotide which is homologous to at least one gene from 
pneumococcus but is not homologous to any gene from GAS. Preferably, the GBS 
polynucleotide is selected from one of the 176 GBS genes included in GBS Subset 3. 
In another embodiment, the polynucleotide is a GBS polynucleotide which is 
homologous with at least one gene from pneumococcus but is not homologous with 
any GAS polynucleotide and does not have a homologous gene in any of the other 
published bacterial genomes at the time of the invention. Preferably, the GBS 
polynucleotide is selected from GBS Subset 3(a). 

In another embodiment, the polynucleotide is a pneumococcus polynucleotide 
which is homologous with at least one gene from GBS, but is not homologous with 
any gene from GAS. Preferably, the pneumococcus polynucleotide is selected from
one of the 195 Spn genes included in Spn Subset 2. In another embodiment, the 
polynucleotide is a pneumococcus polynucleotide which is homologous with at least 
one gene from GBS, but is not homologous with any gene from GAS and does not 
have a homologous gene in any other published bacterial genome at the time of the
invention. Preferably, the pneumococcus polynucleotide is selected from Spn Subset
2(a).

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence
which is specific to GAS and pneumococcus. In one embodiment, the polynucleotide
is a GAS polynucleotide which is homologous with at least one gene from 
pneumococcus but is not homologous with any gene from GBS. Preferably, the GAS 
polynucleotide is selected from one of the 62 GAS genes included in GAS Subset 3. 
In another embodiment, the polynucleotide is a GAS polynucleotide which is 
homologous with at least one gene from pneumococcus but is not homologous with 
any gene from GBS and is not homologous with any gene of any published bacterial 
genome at the time of the invention. Preferably, the GAS polynucleotide is selected 
from GAS Subset 3(a). 

In another embodiment, the polynucleotide is a pneumococcus polynucleotide 
which is homologous with at least one GAS polynucleotide, but is not homologous 
with any GBS gene. Preferably, the pneumococcus polynucleotide is selected from
one of the 74 Spn genes included in Spn Subset 3. In another embodiment, the 
polynucleotide is a pneumococcus polynucleotide which is homologous with at least 
one gene from GAS, but is not homologous with any gene from GBS or with a gene 
from any other published bacterial genome at the time of the invention. Preferably, 
the pneumococcus polynucleotide is selected from Spn Subset 3(a). 
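
The pairwise-specific Subsets described above reduce to a rule on whether a gene has a homologue in each of the other two species. The following is a minimal Python sketch of that rule for a GBS gene. The function name and the boolean inputs are hypothetical, and the way homology itself is decided (for example by a sequence-similarity threshold) is assumed to happen elsewhere. Only the two pairwise categories described in this passage are labelled.

```python
from typing import Optional


def gbs_pairwise_subset(has_gas_homologue: bool,
                        has_spn_homologue: bool) -> Optional[str]:
    """Return the pairwise-specific GBS Subset for a gene, or None.

    Illustrative only: the two flags are assumed to come from a separate
    homology search that is not modelled here.
    """
    if has_gas_homologue and not has_spn_homologue:
        return "GBS Subset 2"  # shared with GAS, absent from pneumococcus
    if has_spn_homologue and not has_gas_homologue:
        return "GBS Subset 3"  # shared with pneumococcus, absent from GAS
    return None  # shared with both or with neither: not covered by this sketch
```

The same two-flag rule would apply, with the species swapped, to the GAS and Spn Subsets 2 and 3.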

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is specific to one or more Streptococcal species serotypes. Preferably, the 
polynucleotide is specific to a Streptococcal species serotype selected from the 
Streptococcal species GBS, GAS and pneumococcus. More preferably, the 
polynucleotide is specific to one or more GBS serotypes selected from the group 
consisting of GBS serotype Ia, Ib, II, III, IV, V, VI, VII and VIII.

The invention further provides an immunogenic composition comprising a
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is conserved across one or more Streptococcal species serotypes. Preferably, 
the polynucleotide is specific to a Streptococcal species serotype selected from the 
Streptococcal species GBS, GAS and pneumococcus. More preferably, the
polynucleotide is conserved across one or more GBS serotypes selected from the
group consisting of GBS serotype Ia, Ib, II, III, IV, V, VI, VII and VIII.

The invention further provides an immunogenic composition comprising a 
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is specific to one or more clinical isolates of a Streptococcal species. 
Preferably, the polynucleotide is specific to a Streptococcal species clinical isolate 
selected from the Streptococcal species GBS, GAS and pneumococcus. More 
preferably, the polynucleotide is specific to one or more GBS clinical isolates selected 
from the clinical isolates identified in Table 5. Still more preferably, the 
polynucleotide is specific to one or more GBS clinical isolates having one or more 
genes selected from the genes listed in Table 7. 

In another embodiment, the polynucleotide is a GBS polynucleotide which is 
homologous to at least one gene from both GAS and pneumococcus and which varies 
among clinical isolates. In another embodiment, the polynucleotide is a GBS 
polynucleotide which is homologous to at least one gene from both GAS and 
pneumococcus and which is homologous with at least one gene from at least one of 
the clinical isolates identified in Table 5. In another embodiment, the polynucleotide 
is a GBS polynucleotide which is homologous to at least one gene from both GAS 
and pneumococcus and which is homologous with at least one gene from each of the 
clinical isolates identified in Table 5. Preferably, the polynucleotide is selected from 
one of the genes listed in Table 7. 
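
The three isolate-related conditions in this paragraph differ only in how many of the clinical isolates carry a homologue of the gene. A minimal sketch, assuming a hypothetical mapping from isolate name to a presence flag (the isolate names used in the usage note are placeholders, not entries taken from Table 5), and reading "varies" as present in some isolates but not in all:

```python
from typing import Dict


def isolate_profile(presence: Dict[str, bool]) -> Dict[str, bool]:
    """Summarise presence of a gene homologue across clinical isolates."""
    flags = list(presence.values())
    return {
        # homologous to a gene from at least one isolate
        "in_at_least_one": any(flags),
        # homologous to a gene from each isolate (False for an empty panel)
        "in_each": bool(flags) and all(flags),
        # assumption: "varies" means present in some isolates but not all
        "varies": any(flags) and not all(flags),
    }
```

For example, `isolate_profile({"isolate_A": True, "isolate_B": False})` reports the gene as present in at least one isolate, not in each, and varying.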

In one embodiment, the polynucleotide is a GBS polynucleotide which is 
homologous to at least one gene from GAS and is not homologous to any gene from 
pneumococcus and which varies among clinical isolates. In another embodiment, the
polynucleotide is a GBS polynucleotide which is homologous to at least one gene
from GAS and is not homologous to any gene from pneumococcus and which is 
homologous to at least one gene from at least one of the clinical isolates identified in 
Table 5. In another embodiment, the polynucleotide is a GBS polynucleotide which 
is homologous to at least one gene from GAS and is not homologous to any gene from 
pneumococcus and which is homologous to at least one gene from each of the clinical 
isolates identified in Table 5. Preferably, the polynucleotide is selected from one of 
the genes listed in Table 7. 

In one embodiment, the polynucleotide is a GBS polynucleotide which is 
homologous to at least one gene from pneumococcus and is not homologous to any 
gene from GAS and which varies among clinical isolates. In another embodiment, the 
polynucleotide is a GBS polynucleotide which is homologous to at least one gene 
from pneumococcus and is not homologous to any gene from GAS and which is 
homologous to at least one gene from at least one of the clinical isolates identified in 
Table 5. In another embodiment, the polynucleotide is a GBS polynucleotide which 
is homologous to at least one gene from pneumococcus and is not homologous to any 
gene from GAS and which is homologous to at least one gene from each of the 
clinical isolates identified in Table 5. Preferably, the polynucleotide is selected from 
one of the genes listed in Table 7. 

In one embodiment, the polynucleotide is a GBS polynucleotide which is not 
homologous to any gene from GAS or pneumococcus and which varies among 
clinical isolates. In another embodiment, the polynucleotide is a GBS polynucleotide 
which is not homologous to any gene from GAS or pneumococcus and which is 
homologous to at least one gene from at least one of the clinical isolates identified in 
Table 5. In another embodiment, the polynucleotide is a GBS polynucleotide which 
is not homologous to any gene from GAS or pneumococcus and which is homologous 
to at least one gene from each of the clinical isolates identified in Table 5. Preferably, 
the polynucleotide is selected from one of the genes listed in Table 7.

The invention further provides an immunogenic composition comprising a
polypeptide, or a fragment thereof, which is encoded by a polynucleotide sequence 
which is conserved across one or more clinical isolates of a Streptococcal species. 
Preferably, the polynucleotide is conserved across one or more Streptococcal clinical 
isolates selected from the Streptococcal species GBS, GAS and pneumococcus. More 
preferably, the polynucleotide is conserved across one or more GBS clinical isolates
identified in Table 5. Still more preferably, the polynucleotide is conserved across 
one or more clinical isolates having one or more genes selected from the genes listed 
in Table 7. 

The invention further provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the Subsets of the invention. Accordingly, the invention provides for an 
immunogenic composition comprising a polypeptide encoded by a polynucleotide 
selected from one or more of the following Subsets: GBS Subset 1, GBS Subset 2,
GBS Subset 3, GBS Subset 4, GAS Subset 1, GAS Subset 2, GAS Subset 3, GAS 
Subset 4, Spn Subset 1, Spn Subset 2, Spn Subset 3, Spn Subset 4, GBS Subset 1(a), 
GBS Subset 2(a), GBS Subset 3(a), GBS Subset 4(a), GAS Subset 1(a), GAS Subset 
2(a), GAS Subset 3(a), GAS Subset 4(a), Spn Subset 1(a), Spn Subset 2(a), Spn 
Subset 3(a), Spn Subset 4(a), GBS Subset 1(b), GBS Subset 2(b), GBS Subset 3(b), 
GBS Subset 4(b), GBS Subset 5, GBS Subset 6, GBS Subset 6(a), GBS Subset 7, 
GBS Subset 8, GBS Subset 9, GBS Subset 10, GBS Subset 11, GBS Subset 12, GBS 
Subset 12(a), GBS Subset 12(b), GBS Subset 12(c), GBS Subset 12(d), GBS Subset 
12(e), GBS Subset 12(f), GBS Subset 12(g), GBS Subset 12(h), GBS Subset 12(i),
GBS Subset 12(j), GBS Subset 12(k), GBS Subset 12(l), GBS Subset 12(m), GBS
Subset 12(n), GBS Subset 12(o), GBS Subset 13(a), GBS Subset 13(b), GBS Subset 
13(c), GBS Subset 13(d), GBS Subset 13(e), GBS Subset 13(f), GBS Subset 13(g), 
GBS Subset 13(h), GBS Subset 13(i), GBS Subset 13(j), GBS Subset 13(k), GBS
Subset 13(l), GBS Subset 13(m), GBS Subset 13(n), GBS Subset 13(o), GBS Subset
13(p), GBS Subset 13(q), GBS Subset 14, GBS Subset 14(a), GBS Subset 14(b), GBS
Subset 14(c), GBS Subset 14(d), GBS Subset 14(e), GBS Subset 14(f), GBS Subset
14(g), and GBS Subset 14(h). 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 1, GBS Subset 2, GBS Subset 3, and 
GBS Subset 4. 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GAS Subset 1, GAS Subset 2, GAS Subset 3, and 
GAS Subset 4. 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: Spn Subset 1, Spn Subset 2, Spn Subset 3, and Spn 
Subset 4. 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 1(a), GBS Subset 2(a), GBS Subset 3(a), 
and GBS Subset 4(a). 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GAS Subset 1(a), GAS Subset 2(a), GAS Subset 3(a), 
and GAS Subset 4(a). 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: Spn Subset 1(a), Spn Subset 2(a), Spn Subset 3(a), 
and Spn Subset 4(a). 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or
more of the following Subsets: GBS Subset 1(b), GBS Subset 2(b), GBS Subset 3(b),
and GBS Subset 4(b).

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from GBS 
Subset 5. 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 6 and GBS Subset 6(a). 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 7. 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 8. 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 9. 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 10. 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 11.

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 12, GBS Subset 12(a), GBS Subset 
12(b), GBS Subset 12(c), GBS Subset 12(d), GBS Subset 12(e), GBS Subset 12(f), 
GBS Subset 12(g), GBS Subset 12(h), GBS Subset 12(i), GBS Subset 12(j), GBS
Subset 12(k), GBS Subset 12(l), GBS Subset 12(m), GBS Subset 12(n), and GBS
Subset 12(o). 

The invention provides for an immunogenic composition comprising a 
polypeptide, or a fragment thereof, encoded by a polynucleotide selected from one or 
more of the following Subsets: GBS Subset 13(a), GBS Subset 13(b), GBS Subset 
13(c), GBS Subset 13(d), GBS Subset 13(e), GBS Subset 13(f), GBS Subset 13(g), 
GBS Subset 13(h), GBS Subset 13(i), GBS Subset 13(j), GBS Subset 13(k), GBS 
Subset 13(l), GBS Subset 13(m), GBS Subset 13(n), GBS Subset 13(o), GBS Subset
13(p), and GBS Subset 13(q).

The invention provides for an immunogenic composition comprising a 
polypeptide or a fragment or derivative thereof encoded by a polynucleotide selected 
from one or more of the following Subsets: GBS Subset 14, GBS Subset 14(a), GBS 
Subset 14(b), GBS Subset 14(c), GBS Subset 14(d), GBS Subset 14(e), GBS Subset 
14(f), GBS Subset 14(g), and GBS Subset 14(h). 

The invention further provides a method for designing an immunogenic 
composition, such as a vaccine, by selecting a polypeptide encoded by a 
polynucleotide selected from one or more of the Subsets of the invention. 

The invention provides a method for raising an immune response in a patient 
by administering any one of the immunogenic compositions set forth above. The 
choice of immunogenic composition means that the immune response may be reactive 
against all three of GAS, GBS and pneumococcus, may be reactive against only two of
the three, or may be reactive only against GBS. 

Each of the immunogenic compositions described above may be prepared and 
administered instead as a polynucleotide where the polypeptide is expressed in vivo. 

The immune response is preferably an antibody response. It may be a 
protective immune response. The patient is preferably a human.

Essential genes and knockouts

The invention provides a Streptococcus bacterium wherein one or more genes 
within any of the Subsets of this invention have been knocked out. The choice of
Subset means that the knocked out gene may be, for instance, a gene found in GBS 
but not in GAS or pneumococcus (e.g. which is involved in the pathogenesis of GBS, 
but not in the pathogenesis of GAS or pneumococcus, such as binding GBS cellular 
targets). 

Techniques for producing knockout bacteria are well known, and knockout 
Streptococci of various species have been reported [e.g. Margolis et al (2001) 
Antimicrob. Agents Chemother. 45:2432-2435; Zhang et al (2000) Cell 102:827-837;
Nizet et al (2000) Infect. Immun. 68:4245-4254; Nizet et al. (1997) Adv. Exp. Med. 
Biol 418:627-630; etc.]. 

The knockout mutation may be situated in the coding region of the gene or 
may lie within its transcriptional control regions (e.g. within its promoter). 

The knockout mutation will reduce the level of mRNA encoding the 
corresponding polypeptide to <1% of that produced by the wild-type bacterium, 
preferably <0.5%, more preferably <0.1%, and most preferably to 0%. 

The knockout mutants of the invention may be used as immunogenic
compositions (e.g. as vaccines) to prevent streptococcal infection. Such a vaccine 
may include the mutant as a live attenuated bacterium. 

The knockout mutants of the invention may be used to determine whether 
genes are essential for bacterial survival, either under normal or stress conditions.

Antisense 

The invention provides a single-stranded nucleic acid comprising a fragment 
of x1 or more nucleotides from a nucleotide sequence selected from one of the Subsets
of the invention. The choice of group means that the nucleic acid may be
complementary to a gene sequence found in GBS, GAS and pneumococcus, or a gene
sequence specific to GBS.

The single-stranded nucleic acid is at least x1 nucleotides long. The value of x1
is at least 7 (e.g. 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
50 etc.). The single-stranded nucleic acid may be at most x2 nucleotides long,
wherein x2 is 100 or less (e.g. 99, 98, 97, 96, 95, 94, 93, 92, 91, 90, 89, 88, 87, 86, 85,
84, 83, 82, 81, 80, 79, 78, 77, 76, 75, 74, 73, 72, 71, 70, 69, 68, 67, 66, 65, 64, 63, 62,
61, 60).

The nucleic acid is preferably of the formula 5'-(N)a-(X)-(N)b-3', wherein
0≤a≤15, 0≤b≤15, N is any nucleotide, and X is the fragment as defined above. The
values of a and b may independently be 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or
15. Each individual nucleotide N in the -(N)a- and -(N)b- portions of the nucleic acid
may be the same or different. The length of the nucleic acid (i.e. a+b+x1) is
preferably x2 or less.
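
The length constraints on the formula 5'-(N)a-(X)-(N)b-3' can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, the fragment X is assumed here to be taken as the reverse complement of a target stretch of the gene (one way of obtaining a complementary, antisense fragment), and x2 defaults to its stated upper bound of 100.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_is_valid(a: int, b: int, x1: int, x2: int = 100) -> bool:
    """Check the length constraints on 5'-(N)a-(X)-(N)b-3'."""
    return (
        x1 >= 7                # fragment X has at least 7 nucleotides
        and 0 <= a <= 15       # 5' flanking nucleotides
        and 0 <= b <= 15       # 3' flanking nucleotides
        and x2 <= 100          # x2 itself is 100 or less
        and a + b + x1 <= x2   # total length is preferably x2 or less
    )


def antisense_oligo(target: str, flank5: str = "", flank3: str = "") -> str:
    """Build flank5 + X + flank3, with X complementary to `target`."""
    x = reverse_complement(target)
    if not design_is_valid(len(flank5), len(flank3), len(x)):
        raise ValueError("length constraints not satisfied")
    return flank5 + x + flank3
```

For instance, `design_is_valid(15, 15, 71)` is false because the total length of 101 exceeds the maximum value of x2.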

Antisense inhibition of streptococcal gene expression is known, e.g. Sato et al.
(1998) FEMS Microbiol. Lett. 159:241-245. Antibacterial antisense techniques are
also disclosed in international patent applications WO99/02673 and WO99/13893.

The single-stranded nucleic acid may reduce the level of polypeptide 
expression from the complementary gene to <1% of that produced by the wild-type
bacterium, preferably <0.5%, more preferably <0.1%, and most preferably to 0%. 

Antisense experiments may be used to determine whether genes are essential 
for bacterial survival, either under normal or stress conditions. 

Screening methods 

The invention provides a method for screening compounds, wherein the
method involves contacting the compounds with a polypeptide expressed by one or
more of the polynucleotides selected from one of the Subsets of the invention. The 
method may be for screening for agonists of the polypeptides, antagonists, antibiotics 
etc. The choice of group means, for instance, that the method may be used for
identifying an antibiotic with broad anti-streptococcal activity, or
for identifying an antibiotic specific to GBS.

Potential compounds for screening include small organic molecules, peptides, 
peptoids, polypeptides, lipids, metals, nucleotides, nucleosides, aptamers, polyamines, 
antibodies, and derivatives thereof. Small organic molecules have a molecular weight 
between 50 and about 2,500 daltons, and most preferably in the range 200-800 
daltons. Complex mixtures of substances, such as extracts containing natural 
products, compound libraries or the products of mixed combinatorial syntheses also 
contain potential antagonists. 

Typically, a polypeptide is incubated with a test compound, and the mixture is 
then tested to see if the polypeptide and test compound interact, or to see if the 
polypeptide's activity is inhibited. 

For preferred high-throughput screening methods, all the biochemical steps for 
this assay are performed in a single solution in, for instance, a test tube or microtitre
plate, and the test compounds are analysed initially at a single compound 
concentration. For the purposes of high throughput screening, the experimental 
conditions are adjusted to achieve a proportion of test compounds identified as 
"positive" compounds from amongst the total compounds screened. 

The invention also provides a compound identified using these methods. 
These can be used to treat or prevent streptococcal infection. The compound
preferably has an affinity for the adhesion-specific protein of at least 10^-7 M e.g. 10^-8
M, 10^-9 M, 10^-10 M or tighter.

Distinguishing Streptococcal species 

The invention provides a method for determining whether a Streptococcus 
bacterium of interest is or is not in the species agalactiae, pyogenes or pneumoniae,
comprising the step(s) of: (a) contacting the bacterium with a nucleic acid probe 
comprising the sequence of a gene selected from one of the Subsets of the invention; 
and/or (b) contacting the bacterium with an antibody which binds to a polypeptide
encoded by one or more of the polynucleotides of one or more of the Subsets of the
invention. The choice of group means, for instance, that the method may be used for 
distinguishing GBS from GAS and from pneumococcus, or for confirming that a 
bacterium is not a GAS or pneumococcus. 

The method will typically include the further step of detecting the presence or
absence of an interaction between the bacterium of interest and the nucleic acid or 
protein. 

The bacterium of interest may be in a cell culture, for example, or may be 
within a biological sample believed or known to contain a streptococcus. It may be 
intact or may be, for instance, lysed. 

The term "biological sample" encompasses a variety of sample types obtained 
from an organism and can be used in a diagnostic or monitoring assay. The term 
encompasses blood and other liquid samples of biological origin, solid tissue samples, 
such as a biopsy specimen or tissue cultures or cells derived therefrom and the 
progeny thereof. The term encompasses samples that have been manipulated in any 
way after their procurement, such as by treatment with reagents, solubilization, or 
enrichment for certain components. The term encompasses a clinical sample, and also 
includes cells in cell culture, cell supernatants, cell lysates, serum, plasma, biological 
fluids, and tissue samples. 

GBS 2603 Type V Genomic Sequence

Applicants have sequenced the complete genome sequence of GBS clinical 
type V isolate 2603 V/R and performed comparative analyses comparing this 
sequence with other GBS strains, with other species of pathogenic Streptococci and 
with other known bacterial species. The entire genomic sequence is available as of 
the filing date of this application at http://www.tigr.org. This genomic sequence is
incorporated herein by reference in its entirety. The genomic sequence of GBS type 
V isolate 2603 V/R is also set forth in International Patent Application WO 02/34771.

In one embodiment, the invention relates to the polynucleotides, and
fragments and derivatives thereof, set forth in the GBS clinical type V isolate 2603 
published genome which are not disclosed within WO 02/34771. The invention 
further relates to polypeptides expressed by the polynucleotides of the invention. 

Applicants have predicted that the GBS 2603 isolate contains approximately 
2,176 predicted genes. Each predicted gene is set forth in Table 1, listed by a 
SAGxxxx ORF number. Table 1 also includes the predicted amino acid size of the 
predicted expressed protein and the predicted function, if known. The sequence of 
each SAG reference can be obtained at the TIGR website.

Figure 1 is a circular representation of the GBS genome and comparative 
hybridisations using microarrays. The outer circle represents predicted coding 
regions on the plus strand color coded by role categories: violet indicating amino acid 
biosynthesis; light blue indicating biosynthesis of cofactors, prosthetic groups, and 
carriers; light green indicating cell envelope; red indicating cellular processes; brown 
indicating central intermediary metabolism; yellow indicating DNA metabolism; light 
gray indicating energy metabolism; magenta indicating fatty acid and phospholipid 
metabolism; pink indicating protein synthesis and fate; orange indicating purines, 
pyrimidines, nucleosides, and nucleotides; olive indicating regulatory functions and 
signal transduction; dark green indicating transcription; teal indicating transport and 
binding proteins; gray indicating unknown function; salmon indicating other 
categories; blue indicating hypothetical proteins. 

The second circle represents predicted coding regions on the minus strand. In 
the third circle, black represents atypical nucleotide composition curve; green 
represents most atypical regions; magenta represents insertion elements; red diamonds 
indicate rRNAs. 

Circles 4-22 represent comparative hybridisations of strain 2603 V/R with 19 
GBS strains. Cy3/Cy5 (2603 V/R signal/test strain signal) ratio cutoffs were defined
arbitrarily as follows: Cy3/Cy5 = 1.0 - 3.0, gene present in the test strain (no color
added); Cy3/Cy5 = 3.0 - 10.0, ambiguous result (blue); Cy3/Cy5 > 10, gene absent in
test strain (red).
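
The ratio cutoffs above can be expressed as a simple classifier. This is a hedged sketch rather than the procedure actually used: the function name is hypothetical, the text does not say how a ratio of exactly 3.0 or a ratio below 1.0 is treated, and the choices made for those cases here are assumptions.

```python
def call_gene(cy3_cy5: float) -> str:
    """Classify a gene in a test strain from its Cy3/Cy5 ratio.

    Cy3 is the 2603 V/R signal and Cy5 the test strain signal.
    """
    if cy3_cy5 > 10.0:
        return "absent"        # shown in red
    if cy3_cy5 >= 3.0:
        return "ambiguous"     # shown in blue; 3.0 assigned here by assumption
    if cy3_cy5 >= 1.0:
        return "present"       # no color added
    return "unspecified"       # ratios below 1.0 are not addressed in the text
```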

Circles 4-9 represent type Ia strains 090, 515, A909, Davis, and DK8.
Circles 10 - 11 represent type Ib strains S7 7357b and H36B. Circles 12 - 13
represent type II strains 18RS21 and DK21. Circles 14 - 18 represent type III strains COH1,
COH31, D136C, M732 and M781. Circle 19 represents type V strain CJB111.
Circles 20 - 21 represent type VIII strains SMU014 and JM91 30013. Circle 22 
represents nontypable (NT) strain CJB110. Throughout Figure 1, varying regions of 
five or more consecutive genes are indicated by yellow bullets. 
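
Locating varying regions of five or more consecutive genes amounts to finding runs in a per-gene flag list. A minimal sketch, assuming a hypothetical list of booleans in genome order (True where a gene varies among the test strains); it treats the gene list as linear and does not join a run that wraps around the origin of the circular chromosome.

```python
from typing import List, Tuple


def variable_regions(varies: List[bool], min_len: int = 5) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for each run of
    consecutive True flags that is at least `min_len` genes long."""
    regions = []
    start = None
    for i, flag in enumerate(varies):
        if flag and start is None:
            start = i                      # a run begins
        elif not flag and start is not None:
            if i - start >= min_len:
                regions.append((start, i))  # run is long enough to report
            start = None
    # close a run that extends to the end of the list
    if start is not None and len(varies) - start >= min_len:
        regions.append((start, len(varies)))
    return regions
```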

Figure 4 depicts a linear representation of the GBS genome. The location of 
predicted coding regions color-coded by biological role (see Figure 1) is displayed.
Arrowed boxes represent the direction of transcription for each ORF. The number of 
membrane-spanning domains predicted by TopPred is displayed as lipid bi-layers on 
top of ORFs, only for those whose products have five or more predicted membrane 
spanning regions. Genes coding for rRNAs (16S, 23S, 5S) and tRNAs (clover leaf 
structure with number of genes) are indicated. Predicted Rho-independent 
transcriptional terminators are represented by hairpins. 

ORFs were predicted by GLIMMER (See, Delcher, et al., (1999) Nucleic
Acids Res. 27, 4636 - 4641 and Salzberg, et al., (1998) Nucleic Acids Res. 26, 544- 
548) trained with ORFs larger than 600 base pairs from the genomic sequence and 
GBS genes available in GenBank. All predicted proteins larger than 30 amino acids 
were searched against a nonredundant protein database. (See Fleischmann, et al., 
(1995) Science 269, 496 - 512). Frame-shifts and point mutations were detected and 
corrected where appropriate; those remaining were annotated as "authentic frame- 
shift" or "authentic point mutation". Protein membrane-spanning domains were 
identified by TOPPRED (See Claros, et al., (1994) Comput. Appl. Biosci. 10, 685 -
686). Candidate lipoprotein signal peptides (See Hayashi et al., (1990) J. Bioenerg. 
Biomembr. 22, 451 - 471) were flagged by N-terminal exact matches to the pattern 
{DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C. Putative
signal peptides were identified by using SIGNALP (Nielsen, et al., (1997) Protein
Eng. 10, 1 - 6). Two sets of hidden Markov models were used to determine ORF
membership in families and superfamilies: PFAM Ver. 5.5 (Bateman, et al., (2000)
Nucleic Acids Res. 28, 263 - 266) and TIGRFAMS 1.0 (Haft, et al., (2001) Nucleic
Acids Res. 29, 41 - 43). Domain-based paralogous families were built by performing 
all-versus-all searches on the protein sequences by using a modified version of a 
previously described method (Nierman, et al., (2001) Proc. Natl. Acad. Sci. USA
98, 4136 - 4141). Potential lineage-specific gene duplications were estimated by
identification of ORFs more similar to ORFs within the GBS genome than to ORFs
from other complete genomes. All ORFs were searched with FASTA3 (Pearson
(2000) Methods Mol. Biol. 132, 185 - 219) against all ORFs from the complete
genomes and matches with a FASTA P value of 10^-15 were considered significant.
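
The lipoprotein signal peptide pattern {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C is written in PROSITE-style notation, in which braces exclude residues and a number in parentheses is a repeat count. It translates directly into a regular expression, as in the sketch below. The function name and the 35-residue N-terminal search window are assumptions made for illustration; the text says only that N-terminal exact matches were flagged.

```python
import re
from typing import Optional

# {DERK}(6)            -> six residues, none of D, E, R, K
# [LIVMFWSTAG](2)      -> two residues from this set
# [LIVMFYSTAGCQ]-[AGS] -> one residue from each set
# C                    -> the candidate lipid-modified cysteine
LIPOBOX = re.compile(r"[^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C")


def lipoprotein_cys_position(protein: str, window: int = 35) -> Optional[int]:
    """Return the 1-based position of the candidate Cys, or None.

    Only the first `window` residues are searched (hypothetical default).
    """
    match = LIPOBOX.search(protein[:window].upper())
    # match.end() is the 0-based index just past the Cys, which equals
    # the 1-based position of the Cys itself
    return match.end() if match else None
```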

The genome consists of a circular chromosome of 2,160,266 base pairs with a 
G+C content of 35.7%. Base pair one of the chromosome was assigned within the
putative origin of replication. The genome contains 80 tRNAs, 7 rRNAs, and 3
sRNAs. Approximately 78% of the 2,176 predicted genes are transcribed in the same 
direction as that of DNA replication, a feature also observed in S. pn. and other low- 
GC Gram positive organisms. 

Biological roles were assigned to 1,409 (65%) of the predicted proteins according to a
classification scheme adapted from Riley (1993) Microbiol Rev. 57, 862 - 952. 
Another 527 predicted proteins (24%) matched proteins of unknown function, and the 
remaining 240 (11%) had no database match. The expression of 50 of these
hypothetical proteins was confirmed by Western Blot analysis, and the proteins were 
annotated as "proteins of unknown function." A total of 339 paralogous protein 
families were identified in strain 2603, containing 941 predicted proteins (43% of the 
total). 
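
As a consistency check on the figures in this paragraph, the three annotation categories sum to the 2,176 predicted genes and the stated percentages follow by rounding. The short sketch below only reproduces that arithmetic; the dictionary labels are paraphrases chosen for illustration.

```python
TOTAL_GENES = 2176

counts = {
    "assigned biological role": 1409,
    "match to protein of unknown function": 527,
    "no database match": 240,
}

# the three categories partition the predicted gene set
assert sum(counts.values()) == TOTAL_GENES

# percentages rounded to whole numbers, as quoted in the text
percentages = {label: round(100 * n / TOTAL_GENES) for label, n in counts.items()}

# share of predicted proteins that fall in paralogous families
paralog_percentage = round(100 * 941 / TOTAL_GENES)
```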

The Western Blot analysis was conducted as follows. GBS strain 2603 V/R 
cells were grown in Todd-Hewitt broth (Difco) to OD600nm = 0.5. The culture was 
centrifuged for 20 minutes at 5,000 rpm. The supernatant was discarded, and bacteria
were washed once with PBS, resuspended in 2 ml of 50 mM Tris-HCl pH 6.8,
containing 400 units of Mutanolysin (Sigma), and incubated 2 hours at 37°C. After
three cycles of freeze and thaw, cellular debris was removed by centrifugation at 
14,000 rpm for 10 minutes, and the protein concentration of the supernatant was 
measured by the Bio-Rad Protein assay, with BSA as a standard. Purified 
recombinant proteins (50 ng) and total cell extracts (25 µg) derived from GBS
serotype V 2603 V/R strain were separated by SDS/PAGE and electroblotted onto
nitrocellulose membranes for 1 hour at 100 V. The membranes were saturated by 
overnight incubation at 4°C in 5% skimmed milk and 0.1% Tween 20 in PBS and
incubated for 1 hour at room temperature with sera from immunized mice diluted 
1:500 - 1:1,000 in saturation buffer. To reduce background due to antibodies raised 
against contaminating E. coli proteins, sera were preincubated with E. coli protein
extracts absorbed on nitrocellulose strips. The membranes were washed twice in 3% 
skimmed milk and 0.1% Tween 20 in PBS and incubated for 1 hour with a 1:1,000 
dilution of horseradish peroxidase-conjugated antimouse Ig (DAKO). After washing 
with 0.1% Tween 20 in PBS, the membranes were developed with the Opti-4CN 
Substrate Kit (Bio-Rad). 

Table 2 comprises a list of predicted and experimentally characterized surface 
and secreted proteins from GBS. Candidate signal peptides and lipoprotein motifs 
were predicted with PSORT [Nakai, K. & Horton, P. (1999) Trends Biochem Sci 24, 
34-6] and other methods (see methods); sortase motifs (LPxTG) were detected using
the FINDPATTERNS program of the GCG Package [Devereux, J., Haeberli, P. &
Smithies, O. (1984) Nucleic Acids Res 12, 387-95] and hidden Markov models. 
Column "Other" indicates proteins carrying other motifs (e.g. integrin-binding motif 
RGD) or that are similar to characterized surface-exposed proteins. Western blot results
were considered positive when the antibodies revealed a predominant band of the 
expected molecular weight on the total protein extracts of S. agalactiae strain 2603 
V/R, ORFs without + or - in this column were not tested in western blot FACS 
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analyses were performed for western blot positive proteins only. Western blot and 
FACS data are displayed only for proteins carrying at least one of the other motifs 
shown in the table. Column "GBS specific" indicates genes unique to S. agalactiae 
(when compared to, other completely sequenced genomes) that are present in all the 
S. agalactiae strains tested in comparative genome hybridization analyses. Finally, 
only proteins carrying less than 3 predicted transmembrane domains are shown in the 
table, other proteins are likely to be embedded in the cytoplasmic membrane and are 
probably not exposed on the organism's surface. 

FACS data were collected as follows: GBS 2603 V/R strain cells were grown 
in Todd-Hewitt broth (Difco) to OD600nm = 0.5. The culture was centrifuged for 20 
minutes at 5,000 rpm, and bacteria were washed once with PBS, resuspended in PBS 
containing 0.05% paraformaldehyde, and incubated for 1 hour at 37°C and then 
overnight at 4°C. Fifty microliters of fixed bacteria (OD600nm 0.1) was washed once 
with PBS, resuspended in 20 µl of newborn calf serum (Sigma), and incubated for 1 
hour at 4°C in 100 µl of preimmune or immune sera diluted 1:200 in dilution 
buffer (PBS, 20% newborn calf serum, 0.1% BSA). After centrifugation and washing 
with 200 µl of washing buffer (0.1% BSA in PBS), samples were incubated for 1 hour 
at 4°C with 50 µl of R-phycoerythrin-conjugated F(ab)2 goat anti-mouse IgG 
(Jackson ImmunoResearch) diluted 1:100 in dilution buffer. Cells were washed with 
200 µl of washing buffer and resuspended in 200 µl of PBS. Samples were analysed 
by using a FACSCalibur apparatus (Becton Dickinson), and data were analyzed by 
using CELL QUEST (Becton Dickinson). A shift in mean fluorescence intensity of 
>75 channels compared with preimmune sera from the same mice was considered 
positive. This cutoff was determined from the mean plus two standard deviations of 
shifts obtained with control sera raised against mock purified recombinant proteins 
from cultures of E. coli carrying the empty expression vector and included in every 
experiment. Artifacts due to bacterial lysis were excluded by using antisera raised 
against six different known cytoplasmic proteins, all of which gave negative results. 
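
The way such a cutoff is derived can be illustrated with a minimal sketch. The function names and the control values below are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev

def facs_cutoff(control_shifts):
    """Cutoff = mean + 2 sample standard deviations of the shifts in mean
    fluorescence intensity seen with control (mock) sera.
    Using the sample SD is an assumption, not stated in the text."""
    return mean(control_shifts) + 2 * stdev(control_shifts)

def is_facs_positive(immune_mfi, preimmune_mfi, cutoff):
    """A serum is scored positive when its shift over the preimmune serum
    from the same mice exceeds the cutoff (in channels)."""
    return (immune_mfi - preimmune_mfi) > cutoff

# Hypothetical control shifts, in channels (not data from the specification).
controls = [10, 20, 30]
cutoff = facs_cutoff(controls)
print(cutoff, is_facs_positive(immune_mfi=180, preimmune_mfi=60, cutoff=75))
```

In the specification the resulting cutoff is fixed at 75 channels; the second call above applies that published value directly.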

Regions of Atypical Nucleotide Composition 

These regions were identified by the χ² analysis: the distribution of all 64 
trinucleotides (3-mers) was computed for the complete genome in all six reading 
frames, followed by the 3-mer distribution in 2,000-bp windows. Windows 
overlapped by 1,000 bp. For each window, the χ² statistic on the difference between 
its 3-mer content and that of the whole genome was computed. 
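
A minimal sketch of this windowed analysis is given below. It assumes that "all six reading frames" means counting every overlapping 3-mer on both strands, and that the expected window counts are the genome-wide 3-mer frequencies scaled to the window size; the function names are hypothetical and no significance threshold is implied.

```python
from collections import Counter
from itertools import product

KMERS = ["".join(p) for p in product("ACGT", repeat=3)]  # all 64 trinucleotides
VALID = set(KMERS)
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def trimer_counts(seq):
    """Count 3-mers at every offset on both strands (six reading frames)."""
    counts = Counter()
    reverse = seq.translate(COMPLEMENT)[::-1]
    for strand in (seq, reverse):
        for i in range(len(strand) - 2):
            kmer = strand[i:i + 3]
            if kmer in VALID:  # skip 3-mers containing ambiguous bases
                counts[kmer] += 1
    return counts

def window_chi2(genome, window=2000, step=1000):
    """Return (start, chi-square) for each window; consecutive windows
    overlap by window - step bases."""
    genome = genome.upper()
    total = trimer_counts(genome)
    n_total = sum(total.values())
    scores = []
    if n_total == 0:
        return scores
    freqs = {k: total[k] / n_total for k in KMERS}
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        obs = trimer_counts(genome[start:start + window])
        n_obs = sum(obs.values())
        if n_obs == 0:
            continue
        chi2 = sum(
            (obs[k] - freqs[k] * n_obs) ** 2 / (freqs[k] * n_obs)
            for k in KMERS
            if freqs[k] > 0
        )
        scores.append((start, chi2))
    return scores
```

Windows with the largest statistic are the candidates for atypical composition; how large is "atypical" is left open here, as it is in the text.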

In Silico Genome Comparisons 

The protein sets of S. agalactiae, Streptococcus pneumoniae and S. pyogenes 
were compared by using FASTA3. A general description of the FASTA3 sequence 
comparison program is given in Pearson, W.R., "Flexible Sequence Similarity 
Searching with the FASTA3 Program Package", (2000) Methods Mol. Biol., 132: 
185-219. Shared genes were defined using a FASTA3 P value cutoff of 10⁻¹⁵. These 
shared genes and genes that S. agalactiae did not share with the other streptococci 
using this cutoff were subsequently searched against all completely sequenced 
genomes, and genes were defined as unique to streptococci or S. agalactiae when they 
did not share similarity with any other gene sets with a FASTA3 P value of 10⁻⁵ or 
lower. The use of two cutoffs provides for a more stringent analysis of shared or 
unique genes. 
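
The two-cutoff logic can be sketched as follows, assuming the cutoffs are 10⁻¹⁵ (shared) and 10⁻⁵ (similarity to any other genome). The function and its inputs are hypothetical: each argument is the best (lowest) FASTA3 P value found for a gene, or None when no hit was reported.

```python
SHARED_CUTOFF = 1e-15   # at or below this, a gene counts as shared between streptococci
UNIQUE_CUTOFF = 1e-5    # at or below this, a hit in another genome rules out "unique"

def classify_gene(best_p_streptococci, best_p_other_genomes):
    """Classify one S. agalactiae gene from its best P values.

    best_p_streptococci: best P value against S. pneumoniae / S. pyogenes.
    best_p_other_genomes: best P value against all other sequenced genomes.
    Either may be None when no hit was reported (an assumption of this sketch).
    """
    shared = best_p_streptococci is not None and best_p_streptococci <= SHARED_CUTOFF
    outside_hit = best_p_other_genomes is not None and best_p_other_genomes <= UNIQUE_CUTOFF
    if outside_hit:
        return "not unique"
    return "unique to streptococci" if shared else "unique to S. agalactiae"

print(classify_gene(1e-40, None))    # shared, no similarity elsewhere
print(classify_gene(1e-3, 0.2))      # not shared, no similarity elsewhere
print(classify_gene(1e-40, 1e-8))    # similar to a gene in another genome
```

A gene with a streptococcal hit between the two cutoffs is neither "shared" nor excluded by this sketch; the text does not say how such intermediate cases were handled.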

Figure 2 is a schematic representation of in silico comparisons between 
streptococci. The protein sets of GBS, S. pn., and GAS were compared by using 
FASTA3. Numbers under the species name indicate genes that are not shared with 
the other species; values in parentheses are the number of proteins in each species 
(excluding frame-shifted and degenerated genes). Numbers in the intersections 
indicate genes shared by two or three species. These are displayed in the color 
corresponding to the species used as the query (GBS: green; S.pn.: blue; GAS: 
red). Numbers in any given intersection are slightly different due to gene duplications 
in some species. 

Table 3 lists genes which were shared among GBS, GAS and pneumococcus, 
but which were not found in any of the other completely sequenced genomes. The 
protein sets of S. agalactiae, S. pneumoniae, and S. pyogenes were compared using 
FASTA3 [Pearson, W. R. (2000) Methods Mol Biol 132, 185-219]. Shared genes 
were defined using a FASTA3 p value cutoff of 10⁻¹⁵. These shared genes and genes 
that S. agalactiae did not share with the other streptococci using this cutoff were 
subsequently searched against all completely sequenced genomes and genes were 
defined as unique to streptococci or S. agalactiae when they did not share similarity 
with any other gene sets with a FASTA3 p value of 10⁻⁵ or lower. 

Synteny 

Regions of conservation of gene synteny were computed as windows of 10 kb 
spanning at least three genes whose order was conserved in the other species. 
Regions were merged if they were less than 20 kb apart. The number of genes within 
each broad region was then calculated. 
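
The merging and counting steps can be sketched as below. The representation of regions as (start, end) coordinate pairs and of genes by their start coordinates is an assumption made for illustration; the function names are hypothetical.

```python
def merge_regions(regions, max_gap=20_000):
    """Merge syntenic regions whose gap is less than max_gap (20 kb)."""
    merged = []
    for start, end in sorted(regions):
        if merged and start - merged[-1][1] < max_gap:
            # Close enough to the previous region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]

def genes_in_region(region, gene_starts):
    """Count genes whose start coordinate falls inside a merged region."""
    start, end = region
    return sum(1 for pos in gene_starts if start <= pos <= end)

# Hypothetical 10-kb windows (coordinates in bp).
windows = [(0, 10_000), (25_000, 35_000), (60_000, 70_000)]
broad = merge_regions(windows)
print(broad)
print([genes_in_region(r, [500, 4_000, 30_000, 65_000]) for r in broad])
```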

Comparative Genome Hybridizations 

Comparative genome hybridizations (See Figure 1) using DNA microarrays 
were performed between the sequenced type V strain 2603 V/R and 19 other GBS 
strains of multiple serotypes (See Table %). Predicted genes from strain 2603 V/R 
were amplified by PCR and arrayed on glass microscope slides. See Peterson, et al., 
(2000) J. Bacteriol. 182, 6192-6202. Genomic DNA was labelled according to 
protocols provided by J. DeRisi (www.microarrays.org/Pdfs/GenomicDNALabel_B.pdf), 
except that the DNA was not digested or sheared before labelling. 
Arrays were scanned with a GENEPIX 4000B scanner (Axon Instruments, Foster 
City, CA), and individual hybridisation signals were quantitated with TIGR 
SPOTFINDER. See Hegde, et al., (2000), Biotechniques 29, 548-550, 552-554, 556. 
Cy3/Cy5 (2603 V/R signal/test strain) ratio cutoffs were defined arbitrarily as 
Cy3/Cy5 = 1.0 - 3.0, gene present in test strain; 3.0 - 10.0, ambiguous result; >10.0, 
gene absent. For ambiguous results, the gene may be divergent in the test strain 
relative to 2603 V/R, or the gene may be absent in the test strain but still produce 
a signal through cross-hybridization with a paralogous gene family or a repetitive 
element. Although the cutoffs are arbitrary, they fit well with the results for the 
variation of the capsule locus in the strains tested (see region 
9 on Figure 1) where most genes are slightly divergent and only a few are completely 
different. 
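
The ratio cutoffs translate directly into a three-way call. In the sketch below the function name is hypothetical, a ratio of exactly 3.0 is treated as ambiguous, and ratios below 1.0 (not covered by the stated ranges) are assumed to indicate presence.

```python
def classify_cgh_ratio(ratio):
    """Call a gene in the test strain from its Cy3/Cy5 ratio
    (2603 V/R signal / test strain signal)."""
    if ratio > 10.0:
        return "absent"
    if ratio >= 3.0:
        return "ambiguous"
    # Covers 1.0-3.0 as stated; ratios below 1.0 are assumed present.
    return "present"

for r in (1.2, 3.0, 7.5, 10.0, 25.0):
    print(r, classify_cgh_ratio(r))
```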

The CGH detected 1,698 genes in all of the strains, whereas 401 genes from 
strain 2603 V/R (18% of the gene complement) were not detected in at least one other 
strain, suggesting that they are absent or significantly divergent in those strains. Two 
hundred sixty (38%) of the 683 genes specific to S. agalactiae when compared with 
the other two streptococci (Fig. 2), including virulence determinants and surface 
proteins, vary among S. agalactiae strains, whereas only 47 (4%) of the genes 
common to all three streptococcal species, including 5 of the 6 sortases identified in 
the genome, vary among strains. Thus, the in silico analysis of genes shared by the 
streptococci that are not expected to vary among this genus is consistent with the 
CGH analysis. Forty-four (25%) of the genes shared by S. agalactiae and S. 
pneumoniae and 44 (20%) of those shared by S. agalactiae and S. pyogenes vary in 
the CGH analysis. The first set contains many glycosyl transferases and proteins 
carrying a cell-wall anchor, whereas the second set displays many phage-related 
genes. One hundred thirty-six of the 315 genes unique to S. agalactiae when 
compared with all sequenced genomes vary among strains. These include R5, three 
capsular genes, two cell wall-anchored proteins, and three transcriptional regulators. 
Three hundred sixty-four (91%) of the 401 varying genes correspond to 15 regions 
containing more than 5 contiguous genes. Ten of these regions display an atypical 
nucleotide composition in strain 2603 V/R (Fig. 1), consistent with the possibility that 
they were horizontally transferred into this strain. Two of the largest regions (region 
4, a prophage and region 7, similar to Tn916 from Enterococcus faecalis) are flanked 
by insertion sequence elements. The 15 regions contain many proteins predicted to be 
anchored on the cell wall or surface exposed, including Rib (region 3), sortases, 
glycosyl transferases, the capsule locus (region 9, divergent in all strains but the other 
type V strain CJB111), and phage-related genes. Region 14 is unique to S. agalactiae 
and spans 33 genes (SAG1989-SAG2021), including 25 proteins of unknown 
function, some of which carry a cell-wall anchor. It is flanked by an ISL3 transposase 
and displays an atypical nucleotide composition. Region 1, unique to S. agalactiae, is 
a possible plasmid or remnant of a phage (SAG0218-SAG0238), contains mostly 
hypothetical proteins, and is flanked by a site-specific recombinase. Region 8 is 
specific to S. agalactiae, comprises 20 proteins of unknown function 
(SAG1018-SAG1037), most of which are predicted to be membrane associated or secreted, and 
displays an atypical nucleotide composition. 

The CGH results were analyzed by profile clustering where genes are grouped 
based on their distribution patterns (Fig. 5). Sixteen clusters of five or more 
contiguous and noncontiguous genes comprising a total of 300 genes were identified 
(Table 6). Several clusters correspond to regions of contiguous genes described 
above. Some clusters of genes that do not share sequence similarity and are located at 
different loci in the genome display an identical profile. For instance, a cluster of 
genes containing a surface antigen (SAG0674-SAG0681) follows the same 
distribution as another cluster containing only hypothetical proteins 
(SAG0247-SAG0249). A putative pathogenicity protein (SAG2063) also clusters with a region 
containing several glycosyl transferases and Sec proteins (SAG1447-SAG1462). 

Profile clustering was also used to group strains based on similarity of gene 
content (Fig. 5). In addition, the sequences of 19 genes from each of 11 S. agalactiae 
strains were determined after PCR amplification and used for phylogenetic analyses. 
The strains were the following: type Ia, 090 and A909; type Ib, H36B; type II, 
18RS21; type III, COH1, M732 and M781; type V, 2603 V/R and 1169NT1; type 
VIII, JM9130013; and nontypeable strain CJB110. The set comprised 8 
housekeeping genes and 11 genes coding for proteins predicted to be surface-exposed 
(Table 7). 

The profile clustering was conducted as follows. The information on presence and absence 
of genes based on the comparative genome hybridisation results was used to group 
genes based on their distribution patterns. The analysis used was essentially identical 
to that used for phylogenetic profile analysis. See Pellegrini, et al., (1999) Proc. 
Natl. Acad. Sci. USA 96, 4285-4288. Each gene was assigned a binary profile based 
on its presence or absence across the different strains, with presence determined by a 
Cy3/Cy5 ratio < 3.0 and absence ≥ 3.0. The gene profiles were then clustered by 
using the single-linkage clustering algorithm with column weighting (all with default 
settings) of CLUSTER (http://rana.lbl.gov). The CLUSTER program also groups the 
strains (columns) based on similarity of gene profiles. Clusters of genes and strains 
were viewed by using TREEVIEW (http://rana.lbl.gov). 
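
The profile assignment step can be illustrated with a short sketch. It does not reproduce the single-linkage clustering of CLUSTER; it only builds the binary profiles and groups genes whose profiles are identical. The gene names and ratios are hypothetical.

```python
from collections import defaultdict

def binary_profile(ratios, cutoff=3.0):
    """1 = present (Cy3/Cy5 ratio < cutoff), 0 = absent (ratio >= cutoff),
    one entry per test strain."""
    return tuple(1 if ratio < cutoff else 0 for ratio in ratios)

def group_identical_profiles(ratio_table):
    """Map each distinct binary profile to the list of genes sharing it.
    This is plain grouping, not the CLUSTER single-linkage algorithm."""
    groups = defaultdict(list)
    for gene, ratios in ratio_table.items():
        groups[binary_profile(ratios)].append(gene)
    return dict(groups)

# Hypothetical Cy3/Cy5 ratios for three genes across four test strains.
table = {
    "geneA": [1.1, 12.0, 1.4, 2.9],
    "geneB": [1.3, 15.5, 1.0, 2.0],
    "geneC": [1.0, 1.2, 1.1, 1.3],
}
for profile, genes in group_identical_profiles(table).items():
    print(profile, genes)
```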

Phylogenetic trees were inferred for the complete set of 19 genes and for the 
subsets of housekeeping and surface-exposed genes. Because the branching patterns 
in all three trees were identical, only the tree of the 19 genes is shown in Fig. 3. The 
degree of polymorphism of the housekeeping and the surface-exposed genes is similar 
(~1 variable site among all of the strains per 100 bp). 
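
A figure of this kind can be computed from an alignment as sketched below. The treatment of gap characters (ignored when deciding whether a column varies) is an assumption, and the short sequences are hypothetical.

```python
def variable_sites_per_100bp(aligned_seqs):
    """Number of alignment columns with more than one distinct base,
    expressed per 100 aligned positions. Gaps ('-') are ignored."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    variable = sum(
        1 for column in zip(*aligned_seqs) if len(set(column) - {"-"}) > 1
    )
    return 100.0 * variable / length

# Hypothetical 10-bp alignment of one gene from three strains.
print(variable_sites_per_100bp(["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC"]))
```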

The sequences of genes from the different strains were aligned by using 
CLUSTALW (See Thompson (1994), Nucleic Acids Res. 22, 4673 - 4680.) and 
trimmed to remove ambiguously aligned regions. Phylogenetic trees of individual 
genes and of concatenated alignments of multiple genes were inferred by using 
maximum likelihood methods of PAUP* 4.0b10 (Sinauer, Sunderland, MA). 
Bootstrap analysis was carried out using PAUP* as well. The possibility of 
recombination among strains was examined by using analysis of sequence variation 
using SIMPLOT (S.C. Ray) and analysis of phylogenetic heterogeneity by using 
MACCLADE (Sinauer). 

Analysis of this variation showed no evidence for major recombination events 
between the strains. There were no long stretches of polymorphic sites that strongly 
supported other trees (analysis with MACCLADE), and there were no significant 
crossover events in plots of sequence similarity between strains (analysis with 
SIMPLOT). Some strain groupings (clades) generated by phylogenetic analysis were 
similar to clusters from the profile analysis (type III strains M781, M732 and COH1; 
type Ia strain 090 and nontypeable strain CJB110), whereas others were different, 
possibly because of the aforementioned problems with the profile clustering. In both 
the phylogenetic analysis and the profile clustering, there is serotype-dependent and 
-independent clustering (Figs. 3 and 5). The presence of strains of the same serotype 
in different clades or clusters could be due to lateral gene transfer. 

Figure 5 demonstrates phylogenetic profiling of GBS strains based on 
comparative genome hybridisations. The information on presence and absence of 
genes based on the microarray comparative genome hybridization results was used for 
phylogenetic profile analysis. The presence of a particular gene or gene cluster is 
indicated in the figure by a red square and the absence of a gene or cluster by a black 
square. The relationship between strains based on this analysis is depicted by the tree 
at the top of the figure. The strains and their serotypes are indicated (NT: 
nontypeable). Clusters with identical profiles are reduced to a single horizontal line 
and the number of genes in each cluster is indicated on the right. The clusters of 5 or 
more genes, labeled in red text and numbered, are listed in Table 6. The 1698 genes 
shared by all 19 strains are labeled in green text. 

Figure 3 depicts a phylogenetic tree of GBS strains based on PCR sequences. 
The sequences of 19 genes (Table 7) from each of 11 GBS strains were aligned and 
trimmed to remove ambiguously aligned regions, and phylogenetic trees were 
inferred. Strain names are indicated in bold, and serotypes are indicated under the 
strain names. Bootstrap values are indicated on the branches. 

Techniques 

A summary of standard techniques and procedures which may be employed in 
order to perform the invention (e.g. to utilise the disclosed sequences for vaccination 
or diagnostic purposes) follows. This summary is not a limitation on the invention, 
but gives examples that may be used, but are not required. 

General 

The practice of the present invention will employ, unless otherwise indicated, 
conventional techniques of molecular biology, microbiology, recombinant DNA, and 
immunology, which are within the skill of the art. Such techniques are explained fully in the 
literature eg. Sambrook Molecular Cloning: A Laboratory Manual, Second Edition (1989) or 
Third Edition (2000); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide 
Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds. 
1984); Transcription and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal Cell 
Culture (R.I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, 
A Practical Guide to Molecular Cloning (1984); the Methods in Enzymology series (Academic 
Press, Inc.), especially volumes 154 & 155; Gene Transfer Vectors for Mammalian Cells (J.H. 
Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory); Mayer and Walker, eds. 
(1987), Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); 
Scopes, (1987) Protein Purification: Principles and Practice, Second Edition (Springer-Verlag, 
N.Y.), and Handbook of Experimental Immunology, Volumes I-IV (D.M. Weir and C. C. 
Blackwell eds. 1986). 

Standard abbreviations for nucleotides and amino acids are used in this specification. 

Further Definitions 

A composition containing X is "substantially free of" Y when at least 85% by weight of the 
total X+Y in the composition is X. Preferably, X comprises at least about 90% by weight of the 
total of X+Y in the composition, more preferably at least about 95% or even 99% by weight. 
The term "comprising" means "including" as well as "consisting", e.g. a composition 
"comprising" X may consist exclusively of X or may include something additional, e.g. X + Y. 
The singular forms "a", "an", and "the" include plural referents unless the context clearly 
dictates otherwise. Thus, for example, reference to "a polynucleotide" includes a plurality of 
such polynucleotides and reference to "an epithelial cell" includes reference to one or more 
cells and equivalents thereof known to those skilled in the art, etc. 
The term "heterologous" refers to two biological components that are not found together in 
nature. The components may be host cells, genes, or regulatory regions, such as promoters. 
Although the heterologous components are not found together in nature, they can function 
together, as when a promoter heterologous to a gene is operably linked to the gene. Another 
example is where a Streptococcal sequence is heterologous to a mouse host cell. A further 
example would be two epitopes from the same or different proteins which have been 
assembled in a single protein in an arrangement not found in nature. 

An "origin of replication" is a polynucleotide sequence that initiates and regulates replication of 
polynucleotides, such as an expression vector. The origin of replication behaves as an 
autonomous unit of polynucleotide replication within a cell, capable of replication under its 
own control. An origin of replication may be needed for a vector to replicate in a particular host 
cell. With certain origins of replication, an expression vector can be reproduced at a high copy 
number in the presence of the appropriate proteins within the cell. Examples of origins are the 
autonomously replicating sequences, which are effective in yeast; and the viral T-antigen, 
effective in COS-7 cells. 

A "mutant" sequence is defined as DNA, RNA or amino acid sequence differing from but 
having sequence identity with the native or disclosed sequence. Depending on the particular 
sequence, the degree of sequence identity between the native or disclosed sequence and the 
mutant sequence is preferably greater than 50% (eg. 60%, 70%, 80%, 90%, 95%, 99% or more, 
calculated using the Smith-Waterman algorithm as described above). As used herein, an "allelic 
variant" of a nucleic acid molecule, or region, for which nucleic acid sequence is provided 
herein is a nucleic acid molecule, or region, that occurs essentially at the same locus in the 
genome of another or second isolate, and that, due to natural variation caused by, for example, 
mutation or recombination, has a similar but not identical nucleic acid sequence. A coding 
region allelic variant typically encodes a protein having similar activity to that of the protein 
encoded by the gene to which it is being compared. An allelic variant can also comprise an 
alteration in the 5' or 3' untranslated regions of the gene, such as in regulatory control regions 
(eg. see US patent 5,753,235). 

Expression systems 

The Streptococcal nucleotide sequences can be expressed in a variety of different expression 
systems; for example those used with mammalian cells, baculoviruses, plants, bacteria, and 
yeast. 

i. Mammalian Systems 

Mammalian expression systems are known in the art. A mammalian promoter is any DNA 
sequence capable of binding mammalian RNA polymerase and initiating the downstream (3') 
transcription of a coding sequence (eg. structural gene) into mRNA. A promoter will have a 
transcription initiating region, which is usually placed proximal to the 5' end of the coding 
sequence, and a TATA box, usually located 25-30 base pairs (bp) upstream of the transcription 
initiation site. The TATA box is thought to direct RNA polymerase II to begin RNA synthesis 
at the correct site. A mammalian promoter will also contain an upstream promoter element, 
usually located within 100 to 200 bp upstream of the TATA box. An upstream promoter 
element determines the rate at which transcription is initiated and can act in either orientation 
[Sambrook et al. (1989) "Expression of Cloned Genes in Mammalian Cells." In Molecular 
Cloning: A Laboratory Manual, 2nd ed.]. 

Mammalian viral genes are often highly expressed and have a broad host range; therefore 
sequences encoding mammalian viral genes provide particularly useful promoter sequences. 
Examples include the SV40 early promoter, mouse mammary tumor virus LTR promoter, 
adenovirus major late promoter (Ad MLP), and herpes simplex virus promoter. In addition, 
sequences derived from non-viral genes, such as the murine metallothionein gene, also provide 
useful promoter sequences. Expression may be either constitutive or regulated (inducible); 
depending on the promoter, expression can be induced with glucocorticoid in hormone-responsive cells. 
The presence of an enhancer element (enhancer), combined with the promoter elements 
described above, will usually increase expression levels. An enhancer is a regulatory DNA 
sequence that can stimulate transcription up to 1000-fold when linked to homologous or 
heterologous promoters, with synthesis beginning at the normal RNA start site. Enhancers are 
also active when they are placed upstream or downstream from the transcription initiation site, 
in either normal or flipped orientation, or at a distance of more than 1000 nucleotides from the 
promoter [Maniatis et al. (1987) Science 236:1237; Alberts et al. (1989) Molecular Biology of 
the Cell, 2nd ed.]. Enhancer elements derived from viruses may be particularly useful, because 
they usually have a broader host range. Examples include the SV40 early gene enhancer 
[Dijkema et al. (1985) EMBO J. 4:761] and the enhancer/promoters derived from the long 
terminal repeat (LTR) of the Rous Sarcoma Virus [Gorman et al. (1982b) Proc. Natl. Acad. Sci. 
79:6777] and from human cytomegalovirus [Boshart et al. (1985) Cell 41:521]. Additionally, 
some enhancers are regulatable and become active only in the presence of an inducer, such as a 
hormone or metal ion [Sassone-Corsi and Borelli (1986) Trends Genet 2:215; Maniatis et al. 
(1987) Science 236:1237]. 

A DNA molecule may be expressed intracellularly in mammalian cells. A promoter sequence 
may be directly linked with the DNA molecule, in which case the first amino acid at the N- 
terminus of the recombinant protein will always be a methionine, which is encoded by the ATG 
start codon. If desired, the N-terminus may be cleaved from the protein by in vitro incubation 
with cyanogen bromide. 

Alternatively, foreign proteins can also be secreted from the cell into the growth media by 
creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence 
fragment that provides for secretion of the foreign protein in mammalian cells. Preferably, there 
are processing sites encoded between the leader fragment and the foreign gene that can be 
cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a signal peptide 
comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. 
The adenovirus tripartite leader is an example of a leader sequence that provides for secretion of 
a foreign protein in mammalian cells. 

Usually, transcription termination and polyadenylation sequences recognized by mammalian 
cells are regulatory regions located 3' to the translation stop codon and thus, together with the 
promoter elements, flank the coding sequence. The 3' terminus of the mature mRNA is formed 
by site-specific post-transcriptional cleavage and polyadenylation [Birnstiel et al. (1985) Cell 
41:349; Proudfoot and Whitelaw (1988) "Termination and 3' end processing of eukaryotic 
RNA." In Transcription and splicing (ed. B.D. Hames and D.M. Glover); Proudfoot (1989) 
Trends Biochem. Sci. 14:105]. These sequences direct the transcription of an mRNA which can 
be translated into the polypeptide encoded by the DNA. Examples of transcription 
terminator/polyadenylation signals include those derived from SV40 [Sambrook et al. (1989) 
"Expression of cloned genes in cultured mammalian cells." In Molecular Cloning: A 
Laboratory Manual]. 

Usually, the above described components, comprising a promoter, polyadenylation signal, and 
transcription termination sequence are put together into expression constructs. Enhancers, 
introns with functional splice donor and acceptor sites, and leader sequences may also be 
included in an expression construct, if desired. Expression constructs are often maintained in a 
replicon, such as an extrachromosomal element (eg. plasmids) capable of stable maintenance in 
a host, such as mammalian cells or bacteria. Mammalian replication systems include those 
derived from animal viruses, which require trans-acting factors to replicate. For example, 
plasmids containing the replication systems of papovaviruses, such as SV40 [Gluzman (1981) 
Cell 23:175] or polyomavirus, replicate to extremely high copy number in the presence of the 
appropriate viral T antigen. Additional examples of mammalian replicons include those derived 
from bovine papillomavirus and Epstein-Barr virus. Additionally, the replicon may have two 
replication systems, thus allowing it to be maintained, for example, in mammalian cells for 
expression and in a prokaryotic host for cloning and amplification. Examples of such 
mammalian-bacteria shuttle vectors include pMT2 [Kaufman et al. (1989) Mol Cell Biol 
9:946] and pHEBO [Shimizu et al. (1986) Mol Cell Biol 6:1074]. 
The transformation procedure used depends upon the host to be transformed. Methods for 
introduction of heterologous polynucleotides into mammalian cells are known in the art and 
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated 
transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in 
liposomes, and direct microinjection of the DNA into nuclei. 

Mammalian cell lines available as hosts for expression are known in the art and include many 
immortalized cell lines available from the American Type Culture Collection (ATCC), 
including but not limited to, Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster 
kidney (BHK) cells, monkey kidney cells (COS), human hepatocellular carcinoma cells (eg. 
Hep G2), and a number of other cell lines. 

ii. Baculovirus Systems 

The polynucleotide encoding the protein can also be inserted into a suitable insect expression 
vector, and is operably linked to the control elements within that vector. Vector construction 
employs techniques which are known in the art. Generally, the components of the expression 
system include a transfer vector, usually a bacterial plasmid, which contains both a fragment of 
the baculovirus genome, and a convenient restriction site for insertion of the heterologous gene 
or genes to be expressed; a wild type baculovirus with a sequence homologous to the 
baculovirus-specific fragment in the transfer vector (this allows for the homologous 
recombination of the heterologous gene into the baculovirus genome); and appropriate insect 
host cells and growth media. 

After inserting the DNA sequence encoding the protein into the transfer vector, the vector and 
the wild type viral genome are transfected into an insect host cell where the vector and viral 
genome are allowed to recombine. The packaged recombinant virus is expressed and 
recombinant plaques are identified and purified. Materials and methods for baculovirus/insect 
cell expression systems are commercially available in kit form from, inter alia, Invitrogen, San 
Diego CA ("MaxBac" kit). These techniques are generally known to those skilled in the art and 
fully described in Summers & Smith, Texas Agricultural Experiment Station Bulletin No. 1555 
(1987) ("Summers & Smith"). 

Prior to inserting the DNA sequence encoding the protein into the baculovirus genome, the 
above described components, comprising a promoter, leader (if desired), coding sequence, and 
transcription termination sequence, are usually assembled into an intermediate transplacement 
construct (transfer vector). This may contain a single gene and operably linked regulatory 
elements; multiple genes, each with its own set of operably linked regulatory elements; or 
multiple genes, regulated by the same set of regulatory elements. Intermediate transplacement 
constructs are often maintained in a replicon, such as an extra-chromosomal element (e.g. 
plasmids) capable of stable maintenance in a host, such as a bacterium. The replicon will have a 
replication system, thus allowing it to be maintained in a suitable host for cloning and 
amplification. 

Currently, the most commonly used transfer vector for introducing foreign genes into AcNPV is 
pAc373. Many other vectors, known to those of skill in the art, have also been designed. These 
include, for example, pVL985 (which alters the polyhedrin start codon from ATG to ATT, and 
which introduces a BamHI cloning site 32 basepairs downstream from the ATT; see Luckow 
and Summers, Virology (1989) 17:31). 

The plasmid usually also contains the polyhedrin polyadenylation signal (Miller et al. (1988) 
Ann. Rev. Microbiol., 42:111) and a prokaryotic ampicillin-resistance (amp) gene and origin of 
replication for selection and propagation in E. coli. 

Baculovirus transfer vectors usually contain a baculovirus promoter. A baculovirus promoter is 
any DNA sequence capable of binding a baculovirus RNA polymerase and initiating the 
downstream (5' to 3') transcription of a coding sequence (eg. structural gene) into mRNA. A 
promoter will have a transcription initiation region which is usually placed proximal to the 5' 
end of the coding sequence. This transcription initiation region usually includes an RNA 
polymerase binding site and a transcription initiation site. A baculovirus transfer vector may 
also have a second domain called an enhancer, which, if present, is usually distal to the 
structural gene. Expression may be either regulated or constitutive. 

Structural genes, abundantly transcribed at late times in a viral infection cycle, provide 
particularly useful promoter sequences. Examples include sequences derived from the gene 
encoding the viral polyhedron protein, Friesen et al., (1986) "The Regulation of Baculovirus 
Gene Expression," in: The Molecular Biology of Baculoviruses (ed. Walter Doerfler); EPO 
Publ. Nos. 127 839 and 155 476; and the gene encoding the p10 protein, Vlak et al., (1988), J. 
Gen. Virol. 69:765. 

DNA encoding suitable signal sequences can be derived from genes for secreted insect or 
baculovirus proteins, such as the baculovirus polyhedrin gene (Carbonell et al. (1988) Gene, 
73:409). Alternatively, since the signals for mammalian cell posttranslational modifications 
(such as signal peptide cleavage, proteolytic cleavage, and phosphorylation) appear to be 
recognized by insect cells, and the signals required for secretion and nuclear accumulation also 
appear to be conserved between the invertebrate cells and vertebrate cells, leaders of non-insect 
origin, such as those derived from genes encoding human α-interferon, Maeda et al., (1985), 
Nature 315:592; human gastrin-releasing peptide, Lebacq-Verheyden et al., (1988), Molec. 
Cell. Biol. 8:3129; human IL-2, Smith et al., (1985) Proc. Natl Acad. Sci. USA, 82:8404; 
mouse IL-3, Miyajima et al., (1987) Gene 58:273; and human glucocerebrosidase, Martin et al. 
(1988) DNA, 7:99, can also be used to provide for secretion in insects. 

A recombinant polypeptide or polyprotein may be expressed intracellularly or, if it is expressed 
with the proper regulatory sequences, it can be secreted. Good intracellular expression of 
nonfused foreign proteins usually requires heterologous genes that ideally have a short leader 
sequence containing suitable translation initiation signals preceding an ATG start signal. If 
desired, methionine at the N-terminus may be cleaved from the mature protein by in vitro 
incubation with cyanogen bromide. 

Alternatively, recombinant polyproteins or proteins which are not naturally secreted can be 
secreted from the insect cell by creating chimeric DNA molecules that encode a fusion protein 
comprised of a leader sequence fragment that provides for secretion of the foreign protein in 
insects. The leader sequence fragment usually encodes a signal peptide comprised of 
hydrophobic amino acids which direct the translocation of the protein into the endoplasmic 
reticulum. 
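
The hydrophobic character of a candidate leader sequence can be estimated numerically. The following is a minimal sketch only, assuming the widely used Kyte-Doolittle hydropathy scale and a sliding window of ten residues; the function names, the window size and the example sequence are illustrative assumptions and are not taken from the references cited above.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle score of a peptide (one-letter codes)."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)


def most_hydrophobic_window(protein: str, window: int = 10):
    """Return (start, score) of the window with the highest mean hydropathy."""
    if len(protein) < window:
        raise ValueError("protein is shorter than the window")
    scored = [
        (mean_hydropathy(protein[i:i + window]), i)
        for i in range(len(protein) - window + 1)
    ]
    best_score, best_start = max(scored)
    return best_start, best_score


if __name__ == "__main__":
    # Hypothetical fusion: short charged N-terminus, hydrophobic core,
    # then a charged stretch standing in for the foreign protein.
    example = "MKK" + "LLLALLVLLA" + "DEKRDEKRDE"
    print(most_hydrophobic_window(example))
```

A strongly positive window near the N-terminus is consistent with, but does not prove, the presence of a functional signal peptide; secretion would still have to be confirmed experimentally.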

After insertion of the DNA sequence and/or the gene encoding the expression product precursor 
of the protein, an insect cell host is co-transformed with the heterologous DNA of the transfer 
vector and the genomic DNA of wild type baculovirus - usually by co-transfection. The 
promoter and transcription termination sequence of the construct will usually comprise a 2-5kb 
section of the baculovirus genome. Methods for introducing heterologous DNA into the desired 
site in the baculovirus are known in the art (See Summers & Smith supra; Ju et al. 
(1987); Smith et al., Mol. Cell. Biol. (1983) 3:2156; and Luckow and Summers (1989)). For 
example, the insertion can be into a gene such as the polyhedrin gene, by homologous double 
crossover recombination; insertion can also be into a restriction enzyme site engineered into the 
desired baculovirus gene. Miller et al., (1989), Bioessays 4:91. The DNA sequence, when cloned 
in place of the polyhedrin gene in the expression vector, is flanked both 5' and 3' by polyhedrin- 
specific sequences and is positioned downstream of the polyhedrin promoter. 

The newly formed baculovirus expression vector is subsequently packaged into an infectious 
recombinant baculovirus. Homologous recombination occurs at low frequency (between about 
1% and about 5%); thus, the majority of the virus produced after cotransfection is still wild-type 
virus. Therefore, a method is necessary to identify recombinant viruses. An advantage of the 
expression system is a visual screen allowing recombinant viruses to be distinguished. The 
polyhedrin protein, which is produced by the native virus, is produced at very high levels in the 
nuclei of infected cells at late times after viral infection. Accumulated polyhedrin protein forms 
occlusion bodies that also contain embedded particles. These occlusion bodies, up to 15 µm in 
size, are highly refractile, giving them a bright shiny appearance that is readily visualized under 
the light microscope. Cells infected with recombinant viruses lack occlusion bodies. To 
distinguish recombinant virus from wild-type virus, the transfection supernatant is plaqued onto 
a monolayer of insect cells by techniques known to those skilled in the art. Namely, the plaques 
are screened under the light microscope for the presence (indicative of wild-type virus) or 
absence (indicative of recombinant virus) of occlusion bodies. "Current Protocols in 
Microbiology" Vol. 2 (Ausubel et al. eds) at 16.8 (Supp. 10, 1990); Summers & Smith, supra; 
Miller et al. (1989). 
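
Because only about 1% to about 5% of the progeny virus is recombinant, the number of plaques that must be screened can be estimated in advance. The sketch below is illustrative only: it assumes that each plaque is independently recombinant with probability f, so that the probability of seeing at least one recombinant among n plaques is 1 - (1 - f)^n. The function name and the 95% confidence level are assumptions made for the example.

```python
import math


def plaques_to_screen(recombinant_fraction: float, confidence: float = 0.95) -> int:
    """Smallest n with 1 - (1 - f)**n >= confidence, assuming independent plaques."""
    if not 0.0 < recombinant_fraction < 1.0:
        raise ValueError("recombinant_fraction must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    n = math.log(1.0 - confidence) / math.log(1.0 - recombinant_fraction)
    return math.ceil(n)


if __name__ == "__main__":
    for f in (0.01, 0.05):
        print(f"f = {f:.2f}: screen at least {plaques_to_screen(f)} plaques")
```

Under these assumptions, roughly 300 plaques are needed at a 1% recombination frequency and roughly 60 at 5%, which illustrates why the visual occlusion-body screen is useful.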

Recombinant baculovirus expression vectors have been developed for infection into several 
insect cells. For example, recombinant baculoviruses have been developed for, inter alia: Aedes 
aegypti, Autographa californica, Bombyx mori, Drosophila melanogaster, Spodoptera 
frugiperda, and Trichoplusia ni (WO 89/046699; Carbonell et al., (1985) J. Virol. 56:153; 
Wright (1986) Nature 321:718; Smith et al., (1983) Mol. Cell. Biol. 3:2156; and see generally, 
Fraser, et al. (1989) In Vitro Cell. Dev. Biol. 25:225). 

Cells and cell culture media are commercially available for both direct and fusion expression of 
heterologous polypeptides in a baculovirus/expression system; cell culture technology is 
generally known to those skilled in the art. See, eg. Summers & Smith supra. 

The modified insect cells may then be grown in an appropriate nutrient medium, which allows 
for stable maintenance of the plasmid(s) present in the modified insect host. Where the 
expression product gene is under inducible control, the host may be grown to high density, and 
expression induced. Alternatively, where expression is constitutive, the product will be 
continuously expressed into the medium and the nutrient medium must be continuously 
circulated, while removing the product of interest and augmenting depleted nutrients. The 
product may be purified by such techniques as chromatography, eg. HPLC, affinity 
chromatography, ion exchange chromatography, etc.; electrophoresis; density gradient 
centrifugation; solvent extraction, etc. As appropriate, the product may be further purified, as 
required, so as to remove substantially any insect proteins which are also present in the 
medium, so as to provide a product which is at least substantially free of host debris, eg. 
proteins, lipids and polysaccharides. 

In order to obtain protein expression, recombinant host cells derived from the transformants are 
incubated under conditions which allow expression of the recombinant protein encoding 
sequence. These conditions will vary, dependent upon the host cell selected. However, the 
conditions are readily ascertainable to those of ordinary skill in the art, based upon what is 
known in the art. 

iii. Plant Systems 

There are many plant cell culture and whole plant genetic expression systems known in the art. 
Exemplary plant cellular genetic expression systems include those described in patents, such as: 
US 5,693,506; US 5,659,122; and US 5,608,143. Additional examples of genetic expression in 
plant cell culture have been described by Zenk, Phytochemistry 30:3861-3863 (1991). 
Descriptions of plant protein signal peptides may be found in addition to the references 
described above in Vaulcombe et al., Mol Gen. Genet 209:33-40 (1987); Chandler et al., Plant 
Molecular Biology 3:407-418 (1984); Rogers, J. Biol Chem. 260:3731-3738 (1985); Rothstein 
et al., Gene 55:353-356 (1987); Whittier et al., Nucleic Acids Research 15:2515-2535 (1987); 
Wirsel et al., Molecular Microbiology 3:3-14 (1989); Yu et al., Gene 122:247-253 (1992). A 
description of the regulation of plant gene expression by the phytohormone, gibberellic acid and 
secreted enzymes induced by gibberellic acid can be found in R.L. Jones and J. MacMillin, 
Gibberellins: in: Advanced Plant Physiology, Malcolm B. Wilkins, ed., 1984 Pitman 
Publishing Limited, London, pp. 21-52. References that describe other metabolically-regulated 
genes: Sheen, Plant Cell, 2:1027-1038 (1990); Maas et al., EMBO J. 9:3447-3452 (1990); 
Benkel and Hickey, Proc. Natl Acad. Sci. 84:1337-1339 (1987). 

Typically, using techniques known in the art, a desired polynucleotide sequence is inserted into 
an expression cassette comprising genetic regulatory elements designed for operation in plants. 
The expression cassette is inserted into a desired expression vector with companion sequences 
upstream and downstream from the expression cassette suitable for expression in a plant host. 
The companion sequences will be of plasmid or viral origin and provide necessary 
characteristics to the vector to permit the vectors to move DNA from an original cloning host, 
such as bacteria, to the desired plant host. The basic bacterial/plant vector construct will 
preferably provide a broad host range prokaryote replication origin; a prokaryote selectable 
marker; and, for Agrobacterium transformations, T DNA sequences for Agrobacterium- 
mediated transfer to plant chromosomes. Where the heterologous gene is not readily amenable 
to detection, the construct will preferably also have a selectable marker gene suitable for 
determining if a plant cell has been transformed. A general review of suitable markers, for 
example for the members of the grass family, is found in Wilmink and Dons, 1993, Plant Mol. 
Biol. Reptr, 11(2):165-185. 

Sequences suitable for permitting integration of the heterologous sequence into the plant 
genome are also recommended. These might include transposon sequences and the like for 
homologous recombination as well as Ti sequences which permit random insertion of a 
heterologous expression cassette into a plant genome. Suitable prokaryote selectable markers 
include resistance toward antibiotics such as ampicillin or tetracycline. Other DNA sequences 
encoding additional functions may also be present in the vector, as is known in the art. 

The nucleic acid molecules of the subject invention may be included into an expression cassette 
for expression of the protein(s) of interest. Usually, there will be only one expression cassette, 
although two or more are feasible. The recombinant expression cassette will contain, in addition 
to the heterologous protein encoding sequence, the following elements: a promoter region, plant 
5' untranslated sequences, initiation codon depending upon whether or not the structural gene 
comes equipped with one, and a transcription and translation termination sequence. Unique 
restriction enzyme sites at the 5' and 3' ends of the cassette allow for easy insertion into a pre- 
existing vector. 
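
Whether the restriction sites chosen for the ends of a cassette are in fact unique can be checked computationally before cloning. The following is a minimal sketch, assuming the cassette is available as a plain DNA string; the enzyme table lists only three well-known palindromic recognition sequences, and the function names and example cassette are hypothetical.

```python
# Recognition sequences of three common enzymes (all palindromic, so
# scanning one strand is sufficient).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def count_site(seq: str, site: str) -> int:
    """Count occurrences of a recognition sequence, including overlapping ones."""
    seq = seq.upper()
    return sum(
        1 for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site
    )


def has_unique_flanking_sites(cassette: str, five_prime: str, three_prime: str) -> bool:
    """True if the 5' site occurs only at the start and the 3' site only at the end."""
    cassette = cassette.upper()
    site5, site3 = SITES[five_prime], SITES[three_prime]
    return (
        cassette.startswith(site5)
        and cassette.endswith(site3)
        and count_site(cassette, site5) == 1
        and count_site(cassette, site3) == 1
    )


if __name__ == "__main__":
    # Hypothetical cassette: EcoRI site, short coding stretch, BamHI site.
    cassette = "GAATTC" + "ATGGCTAGCAAA" + "GGATCC"
    print(has_unique_flanking_sites(cassette, "EcoRI", "BamHI"))
```

If either site also occurs inside the cassette, digestion would cut the insert itself, and a different enzyme pair would normally be selected.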

A heterologous coding sequence may be for any protein relating to the present invention. The 
sequence encoding the protein of interest will encode a signal peptide which allows processing 
and translocation of the protein, as appropriate, and will usually lack any sequence which might 
result in the binding of the desired protein of the invention to a membrane. Since, for the most 
part, the transcriptional initiation region will be for a gene which is expressed and translocated 
during germination, by employing the signal peptide which provides for translocation, one may 
also provide for translocation of the protein of interest. In this way, the protein(s) of interest 
will be translocated from the cells in which they are expressed and may be efficiently harvested. 
Typically, secretion in seeds is across the aleurone or scutellar epithelium layer into the 
endosperm of the seed. While it is not required that the protein be secreted from the cells in 
which the protein is produced, this facilitates the isolation and purification of the recombinant 
protein. 

Since the ultimate expression of the desired gene product will be in a eucaryotic cell it is 
desirable to determine whether any portion of the cloned gene contains sequences which will be 
processed out as introns by the host's spliceosome machinery. If so, site-directed mutagenesis of 
the "intron" region may be conducted to prevent losing a portion of the genetic message as a 
false intron code, Reed and Maniatis, Cell 41:95-105, 1985. 

The vector can be microinjected directly into plant cells by use of micropipettes to 
mechanically transfer the recombinant DNA. Crossway, Mol. Gen. Genet., 202:179-185, 1985. 
The genetic material may also be transferred into the plant cell by using polyethylene glycol, 
Krens, et al., Nature, 296, 72-74, 1982. Another method of introduction of nucleic acid 
segments is high velocity ballistic penetration by small particles with the nucleic acid either 
within the matrix of small beads or particles, or on the surface, Klein, et al., Nature, 327, 70-73, 
1987 and Knudsen and Muller, 1991, Planta, 185:330-336 teaching particle bombardment of 
barley endosperm to create transgenic barley. Yet another method of introduction would be 
fusion of protoplasts with other entities, either minicells, cells, lysosomes or other fusible lipid- 
surfaced bodies, Fraley, et al., Proc. Natl Acad. Sci. USA, 79, 1859-1863, 1982. 

The vector may also be introduced into the plant cells by electroporation (Fromm et al., Proc. 
Natl Acad. Sci. USA 82:5824, 1985). In this technique, plant protoplasts are electroporated in 
the presence of plasmids containing the gene construct. Electrical impulses of high field 
strength reversibly permeabilize biomembranes allowing the introduction of the plasmids. 
Electroporated plant protoplasts reform the cell wall, divide, and form plant callus. 

All plants from which protoplasts can be isolated and cultured to give whole regenerated plants 
can be transformed by the present invention so that whole plants are recovered which contain 
the transferred gene. It is known that practically all plants can be regenerated from cultured 
cells or tissues, including but not limited to all major species of sugarcane, sugar beet, cotton, 
fruit and other trees, legumes and vegetables. Some suitable plants include, for example, species 
from the genera Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, 
Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, 
Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, 
Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, 
Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, 
Browallia, Glycine, Lolium, Zea, Triticum, Sorghum, and Datura. 

Means for regeneration vary from species to species of plants, but generally a suspension of 
transformed protoplasts containing copies of the heterologous gene is first provided. Callus 
tissue is formed and shoots may be induced from callus and subsequently rooted. Alternatively, 
embryo formation can be induced from the protoplast suspension. These embryos germinate as 
natural embryos to form plants. The culture media will generally contain various amino acids 
and hormones, such as auxin and cytokinins. It is also advantageous to add glutamic acid and 
proline to the medium, especially for such species as corn and alfalfa. Shoots and roots 
normally develop simultaneously. Efficient regeneration will depend on the medium, on the 
genotype, and on the history of the culture. If these three variables are controlled, then 
regeneration is fully reproducible and repeatable. 

In some plant cell culture systems, the desired protein of the invention may be excreted or 
alternatively, the protein may be extracted from the whole plant. Where the desired protein of 
the invention is secreted into the medium, it may be collected. Alternatively, the embryos and 
embryoless-half seeds or other plant tissue may be mechanically disrupted to release any 
secreted protein between cells and tissues. The mixture may be suspended in a buffer solution 
to retrieve soluble proteins. Conventional protein isolation and purification methods will be 
then used to purify the recombinant protein. Parameters of time, temperature, pH, oxygen, and 
volumes will be adjusted through routine methods to optimize expression and recovery of 
heterologous protein. 

iv. Bacterial Systems 

Bacterial expression techniques are known in the art. A bacterial promoter is any DNA 
sequence capable of binding bacterial RNA polymerase and initiating the downstream (3') 
transcription of a coding sequence (eg. structural gene) into mRNA. A promoter will have a 
transcription initiation region which is usually placed proximal to the 5' end of the coding 
sequence. This transcription initiation region usually includes an RNA polymerase binding site 
and a transcription initiation site. A bacterial promoter may also have a second domain called 
an operator, that may overlap an adjacent RNA polymerase binding site at which RNA 
synthesis begins. The operator permits negative regulated (inducible) transcription, as a gene 
repressor protein may bind the operator and thereby inhibit transcription of a specific gene. 
Constitutive expression may occur in the absence of negative regulatory elements, such as the 
operator. In addition, positive regulation may be achieved by a gene activator protein binding 
sequence, which, if present, is usually proximal (5') to the RNA polymerase binding sequence. 
An example of a gene activator protein is the catabolite activator protein (CAP), which helps 
initiate transcription of the lac operon in Escherichia coli (E. coli) [Raibaud et al. (1984) Annu. 
Rev. Genet. 18:173]. Regulated expression may therefore be either positive or negative, thereby 
either enhancing or reducing transcription. 

Sequences encoding metabolic pathway enzymes provide particularly useful promoter 
sequences. Examples include promoter sequences derived from sugar metabolizing enzymes, 
such as galactose, lactose (lac) [Chang et al. (1977) Nature 198:1056], and maltose. Additional 
examples include promoter sequences derived from biosynthetic enzymes such as tryptophan 
(trp) [Goeddel et al. (1980) Nuc. Acids Res. 8:4057; Yelverton et al. (1981) Nucl. Acids Res. 
9:731; US patent 4,738,921; EP-A-0036776 and EP-A-0121775]. The β-lactamase (bla) 
promoter system [Weissmann (1981) "The cloning of interferon and other mistakes." In 
Interferon 3 (ed. I. Gresser)], bacteriophage lambda PL [Shimatake et al. (1981) Nature 
292:128] and T5 [US patent 4,689,406] promoter systems also provide useful promoter 
sequences. 

In addition, synthetic promoters which do not occur in nature also function as bacterial 
promoters. For example, transcription activation sequences of one bacterial or bacteriophage 
promoter may be joined with the operon sequences of another bacterial or bacteriophage 
promoter, creating a synthetic hybrid promoter [US patent 4,551,433]. For example, the tac 
promoter is a hybrid trp-lac promoter comprised of both trp promoter and lac operon sequences 
that is regulated by the lac repressor [Amann et al. (1983) Gene 25:167; de Boer et al. (1983) 
Proc. Natl Acad. Sci. 80:21]. Furthermore, a bacterial promoter can include naturally occurring 
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and 
initiate transcription. A naturally occurring promoter of non-bacterial origin can also be coupled 
with a compatible RNA polymerase to produce high levels of expression of some genes in 
prokaryotes. The bacteriophage T7 RNA polymerase/promoter system is an example of a 
coupled promoter system [Studier et al. (1986) J. Mol. Biol. 189:113; Tabor et al. (1985) Proc. 
Natl Acad. Sci. 82:1074]. In addition, a hybrid promoter can also be comprised of a 
bacteriophage promoter and an E. coli operator region (EPO-A-0 267 851). 

In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful 
for the expression of foreign genes in prokaryotes. In E. coli, the ribosome binding site is called 
the Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 
nucleotides in length located 3-11 nucleotides upstream of the initiation codon [Shine et al. 
(1975) Nature 254:34]. The SD sequence is thought to promote binding of mRNA to the 
ribosome by the pairing of bases between the SD sequence and the 3' end of E. coli 16S rRNA 
[Steitz et al. (1979) "Genetic signals and nucleotide sequences in messenger RNA." In 
Biological Regulation and Development: Gene Expression (ed. R.F. Goldberger)]. For the 
expression of eukaryotic genes and prokaryotic genes with a weak ribosome-binding site, see 
[Sambrook et al. (1989) "Expression of cloned genes in Escherichia coli." In Molecular Cloning: 
A Laboratory Manual]. 
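
The spacing rule described above (a 3-9 nucleotide SD sequence located 3-11 nucleotides upstream of the initiation codon) can be checked on a candidate construct. The sketch below is illustrative only. It assumes the consensus TAAGGAGGT (the DNA-alphabet complement of the 3' end of E. coli 16S rRNA), counts the gap as the number of nucleotides between the last base of the motif and the A of the ATG, and accepts only exact matches to part of the consensus; the function name, the defaults and the example sequence are hypothetical.

```python
# DNA-alphabet consensus complementary to the 3' end of E. coli 16S rRNA.
SD_CONSENSUS = "TAAGGAGGT"


def find_sd(seq, atg_index, min_len=3, max_len=9, min_gap=3, max_gap=11):
    """Return (motif, gap) for the longest stretch matching part of the
    consensus whose 3' end lies min_gap..max_gap nt upstream of the ATG,
    or None if no such stretch exists."""
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    best = None
    for gap in range(min_gap, max_gap + 1):
        end = atg_index - gap  # exclusive end of the candidate motif
        for length in range(max_len, min_len - 1, -1):
            start = end - length
            if start < 0:
                continue
            candidate = seq[start:end]
            if candidate in SD_CONSENSUS:
                if best is None or length > len(best[0]):
                    best = (candidate, gap)
                break  # longest match for this gap has been found
    return best


if __name__ == "__main__":
    # Hypothetical 5' region: consensus, 5-nt spacer, then the start codon.
    leader = "CCC" + "TAAGGAGGT" + "AAAAA" + "ATG" + "GCT"
    print(find_sd(leader, leader.index("ATG")))
```

Real ribosome binding sites tolerate mismatches, so an exact-match scan of this kind will miss weaker sites; it is intended only to flag constructs in which no recognizable SD sequence lies at a suitable distance from the ATG.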

A DNA molecule may be expressed intracellularly. A promoter sequence may be directly 
linked with the DNA molecule, in which case the first amino acid at the N-terminus will always 
be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N- 
terminus may be cleaved from the protein by in vitro incubation with cyanogen bromide or by 
either in vivo or in vitro incubation with a bacterial methionine N-terminal peptidase (EPO-A-0 
219 237). 
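
Cyanogen bromide cleaves on the C-terminal side of every methionine residue, not only the N-terminal one, so a protein containing internal methionines would be fragmented by this treatment. The following minimal sketch predicts the fragments for a given amino acid sequence; the function names and the example sequences are hypothetical and serve only to illustrate the check.

```python
def cnbr_fragments(protein: str) -> list[str]:
    """Split a protein (one-letter codes) after every methionine."""
    fragments, current = [], ""
    for residue in protein.upper():
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


def has_internal_met(protein: str) -> bool:
    """True if any methionine occurs after the first residue."""
    return "M" in protein.upper()[1:]


if __name__ == "__main__":
    for candidate in ("MKTAYIAK", "MKTMAY"):  # hypothetical sequences
        print(candidate, cnbr_fragments(candidate), has_internal_met(candidate))
```

Where internal methionines are present, enzymatic removal with a methionine N-terminal peptidase, as mentioned above, may be the more suitable option.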

Fusion proteins provide an alternative to direct expression. Usually, a DNA sequence encoding 
the N-terminal portion of an endogenous bacterial protein, or other stable protein, is fused to the 
5' end of heterologous coding sequences. Upon expression, this construct will provide a fusion 
of the two amino acid sequences. For example, the bacteriophage lambda cII gene can be 
linked at the 5' terminus of a foreign gene and expressed in bacteria. The resulting fusion 
protein preferably retains a site for a processing enzyme (factor Xa) to cleave the bacteriophage 
protein from the foreign gene [Nagai et al (1984) Nature 309:810]. Fusion proteins can also be 
made with sequences from the lacZ [Jia et al. (1987) Gene 60:197], trpE [Allen et al. (1987) J. 
Biotechnol. 5:93; Makoff et al. (1989) J. Gen. Microbiol. 135:11], and Chey [EP-A-0 324 647] 
genes. The DNA sequence at the junction of the two amino acid sequences may or may not 
encode a cleavable site. Another example is a ubiquitin fusion protein. Such a fusion protein is 
made with the ubiquitin region that preferably retains a site for a processing enzyme (eg. 
ubiquitin specific processing-protease) to cleave the ubiquitin from the foreign protein. Through 
this method, native foreign protein can be isolated [Miller et al. (1989) Bio/Technology 7:698]. 
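
The release of a foreign protein from a fusion partner at a processing site can be sketched in silico. The example below assumes the commonly cited factor Xa recognition sequence Ile-Glu-Gly-Arg (IEGR), with cleavage after the arginine; the carrier and foreign sequences and the function name are hypothetical and are not taken from the cited references.

```python
FACTOR_XA_SITE = "IEGR"  # assumed recognition sequence; cleavage follows the R


def cleave_fusion(fusion: str, site: str = FACTOR_XA_SITE):
    """Return (carrier_with_site, released_protein) after cleavage at the
    first occurrence of the processing site."""
    position = fusion.upper().find(site)
    if position == -1:
        raise ValueError("no processing site found in the fusion protein")
    cut = position + len(site)
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    carrier = "MVRANKRNEALR"   # hypothetical N-terminal fusion partner
    foreign = "VHLTPEEKSAVT"   # hypothetical foreign protein
    print(cleave_fusion(carrier + FACTOR_XA_SITE + foreign))
```

Because the cut follows the site, the released protein carries no residues from the fusion partner. A foreign protein that itself contains the site would also be cut, so the sequence would normally be checked beforehand.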

Alternatively, foreign proteins can also be secreted from the cell by creating chimeric DNA 
molecules that encode a fusion protein comprised of a signal peptide sequence fragment that 
provides for secretion of the foreign protein in bacteria [US patent 4,336,336]. The signal 
sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids 
which direct the secretion of the protein from the cell. The protein is either secreted into the 
growth media (gram-positive bacteria) or into the periplasmic space, located between the inner 
and outer membrane of the cell (gram-negative bacteria). Preferably there are processing sites, 
which can be cleaved either in vivo or in vitro encoded between the signal peptide fragment and 
the foreign gene. 

DNA encoding suitable signal sequences can be derived from genes for secreted bacterial 
proteins, such as the E. coli outer membrane protein gene (ompA) [Masui et al (1983), in: 
Experimental Manipulation of Gene Expression; Ghrayeb et al. (1984) EMBO J. 3:2437] and 
the E. coli alkaline phosphatase signal sequence (phoA) [Oka et al. (1985) Proc. Natl Acad. 
Sci. 82:7212]. As an additional example, the signal sequence of the alpha-amylase gene from 
various Bacillus strains can be used to secrete heterologous proteins from B. subtilis [Palva et 
al. (1982) Proc. Natl Acad. Sci. USA 79:5582; EP-A-0 244 042]. 

Usually, transcription termination sequences recognized by bacteria are regulatory regions 
located 3' to the translation stop codon, and thus together with the promoter flank the coding 
sequence. These sequences direct the transcription of an mRNA which can be translated into the 
polypeptide encoded by the DNA. Transcription termination sequences frequently include DNA 
sequences of about 50 nucleotides capable of forming stem loop structures that aid in 
terminating transcription. Examples include transcription termination sequences derived from 
genes with strong promoters, such as the trp gene in E. coli as well as other biosynthetic genes. 
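
Candidate stem loop structures of the kind found in transcription termination sequences can be located by searching for inverted repeats. The following is a minimal sketch, assuming perfect base pairing, a fixed stem length of six nucleotides and a loop of three to eight nucleotides; these parameters, the function names and the example sequence are illustrative assumptions only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_hairpins(seq: str, stem: int = 6, min_loop: int = 3, max_loop: int = 8):
    """Return a list of (start, stem_seq, loop_seq) for perfect inverted repeats."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[start:start + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = seq[right_start:right_start + stem]
            if len(right) < stem:
                break  # ran off the end of the sequence
            if right == target:
                hits.append((start, left, seq[start + stem:right_start]))
    return hits


if __name__ == "__main__":
    # Hypothetical terminator-like region: GC-rich stem, 4-nt loop, T-rich tail.
    region = "AAAA" + "GCCGCC" + "TTTT" + "GGCGGC" + "TTTTTT"
    for hit in find_hairpins(region):
        print(hit)
```

A search of this kind only identifies sequences capable of forming a stem loop; whether a given structure actually terminates transcription would have to be determined experimentally.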

Usually, the above described components, comprising a promoter, signal sequence (if desired), 
coding sequence of interest, and transcription termination sequence, are put together into 
expression constructs. Expression constructs are often maintained in a replicon, such as an 
extrachromosomal element (eg. plasmids) capable of stable maintenance in a host, such as 
bacteria. The replicon will have a replication system, thus allowing it to be maintained in a 
prokaryotic host either for expression or for cloning and amplification. In addition, a replicon 
may be either a high or low copy number plasmid. A high copy number plasmid will generally 
have a copy number ranging from about 5 to about 200, and usually about 10 to about 150. A 
host containing a high copy number plasmid will preferably contain at least about 10, and more 
preferably at least about 20 plasmids. Either a high or low copy number vector may be selected, 
depending upon the effect of the vector and the foreign protein on the host. 

Alternatively, the expression constructs can be integrated into the bacterial genome with an 
integrating vector. Integrating vectors usually contain at least one sequence homologous to the 
bacterial chromosome that allows the vector to integrate. Integrations appear to result from 
recombinations between homologous DNA in the vector and the bacterial chromosome. For 
example, integrating vectors constructed with DNA from various Bacillus strains integrate into 
the Bacillus chromosome (EP-A- 0 127 328). Integrating vectors may also be comprised of 
bacteriophage or transposon sequences. 

Usually, extrachromosomal and integrating expression constructs may contain selectable 
markers to allow for the selection of bacterial strains that have been transformed. Selectable 
markers can be expressed in the bacterial host and may include genes which render bacteria 
resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin (neomycin), 
and tetracycline [Davies et al. (1978) Annu. Rev. Microbiol. 32:469]. Selectable markers may 
also include biosynthetic genes, such as those in the histidine, tryptophan, and leucine 
biosynthetic pathways. 

Alternatively, some of the above described components can be put together in transformation 
vectors. Transformation vectors are usually comprised of a selectable marker that is either 
maintained in a replicon or developed into an integrating vector, as described above. 

Expression and transformation vectors, either extra-chromosomal replicons or integrating 
vectors, have been developed for transformation into many bacteria. For example, expression 
vectors have been developed for, inter alia, the following bacteria: Bacillus subtilis [Palva et al 
(1982) Proc. Natl Acad. Sci. USA 79:5582; EP-A-0 036 259 and EP-A-0 063 953; WO 
84/04541], Escherichia coli [Shimatake et al. (1981) Nature 292:128; Amann et al. (1985) 
Gene 40:183; Studier et al. (1986) J. Mol. Biol. 189:113; EP-A-0 036 776, EP-A-0 136 829 and 
EP-A-0 136 907], Streptococcus cremoris [Powell et al. (1988) Appl. Environ. Microbiol. 
54:655]; Streptococcus lividans [Powell et al. (1988) Appl. Environ. Microbiol. 54:655], 
Streptomyces lividans [US patent 4,745,056]. 

Methods of introducing exogenous DNA into bacterial hosts are well-known in the art, and 
usually include either the transformation of bacteria treated with CaCl2 or other agents, such as 
divalent cations and DMSO. DNA can also be introduced into bacterial cells by electroporation. 
Transformation procedures usually vary with the bacterial species to be transformed. See eg. 
[Masson et al. (1989) FEMS Microbiol. Lett. 60:273; Palva et al. (1982) Proc. Natl Acad. Sci. 
USA 79:5582; EP-A-0 036 259 and EP-A-0 063 953; WO 84/04541, Bacillus], [Miller et al. 
(1988) Proc. Natl Acad. Sci. 85:856; Wang et al. (1990) J. Bacteriol. 172:949, 
Campylobacter], [Cohen et al. (1973) Proc. Natl Acad. Sci. 69:2110; Dower et al. (1988) 
Nucleic Acids Res. 16:6127; Kushner (1978) "An improved method for transformation of 
Escherichia coli with ColE1-derived plasmids." In Genetic Engineering: Proceedings of the 
International Symposium on Genetic Engineering (eds. H.W. Boyer and S. Nicosia); Mandel et 
al. (1970) J. Mol. Biol. 53:159; Taketo (1988) Biochim. Biophys. Acta 949:318; Escherichia], 
[Chassy et al. (1987) FEMS Microbiol. Lett. 44:173, Lactobacillus]; [Fiedler et al. (1988) Anal. 
Biochem. 170:38, Pseudomonas]; [Augustin et al. (1990) FEMS Microbiol. Lett. 66:203, 
Staphylococcus], [Barany et al. (1980) J. Bacteriol. 144:698; Harlander (1987) "Transformation 
of Streptococcus lactis by electroporation," in: Streptococcal Genetics (ed. J. Ferretti and R. 
Curtiss III); Perry et al. (1981) Infect. Immun. 32:1295; Powell et al. (1988) Appl. Environ. 
Microbiol. 54:655; Somkuti et al. (1987) Proc. 4th Evr. Cong. Biotechnology 1:412, 
Streptococcus]. 

v. Yeast Expression 

Yeast expression systems are also known to one of ordinary skill in the art. A yeast promoter is 
any DNA sequence capable of binding yeast RNA polymerase and initiating the downstream 
(3') transcription of a coding sequence (eg. structural gene) into mRNA. A promoter will have a 
transcription initiation region which is usually placed proximal to the 5' end of the coding 
sequence. This transcription initiation region usually includes an RNA polymerase binding site 
(the "TATA Box") and a transcription initiation site. A yeast promoter may also have a second 
domain called an upstream activator sequence (UAS), which, if present, is usually distal to the 
structural gene. The UAS permits regulated (inducible) expression. Constitutive expression 
occurs in the absence of a UAS. Regulated expression may be either positive or negative, 
thereby either enhancing or reducing transcription. 

Yeast is a fermenting organism with an active metabolic pathway; therefore, sequences encoding 
enzymes in the metabolic pathway provide particularly useful promoter sequences. Examples 
include alcohol dehydrogenase (ADH) (EP-A-0 284 044), enolase, glucokinase, glucose-6- 
phosphate isomerase, glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), 
hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate kinase (PyK) 
(EPO-A-0 329 203). The yeast PHO5 gene, encoding acid phosphatase, also provides useful 
promoter sequences [Myanohara et al. (1983) Proc. Natl Acad. Sci. USA 80:1]. 

In addition, synthetic promoters which do not occur in nature also function as yeast promoters. 
For example, UAS sequences of one yeast promoter may be joined with the transcription 
activation region of another yeast promoter, creating a synthetic hybrid promoter. Examples of 
such hybrid promoters include the ADH regulatory sequence linked to the GAP transcription 
activation region (US Patent Nos. 4,876,197 and 4,880,734). Other examples of hybrid
promoters include promoters which consist of the regulatory sequences of either the ADH2,
GAL4, GAL10, or PHO5 genes, combined with the transcriptional activation region of a
glycolytic enzyme gene such as GAP or PyK (EP-A-0 164 556). Furthermore, a yeast promoter
can include naturally occurring promoters of non-yeast origin that have the ability to bind yeast
RNA polymerase and initiate transcription. Examples of such promoters include, inter alia,
[Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078; Henikoff et al. (1981) Nature
235:835; Hollenberg et al. (1981) Curr. Topics Microbiol. Immunol. 96:119; Hollenberg et al.
(1979) "The Expression of Bacterial Antibiotic Resistance Genes in the Yeast Saccharomyces
cerevisiae," in: Plasmids of Medical, Environmental and Commercial Importance (eds. K.N.
Timmis and A. Puhler); Mercerau-Puigalon et al. (1980) Gene 11:163; Panthier et al. (1980)
Curr. Genet. 2:109].

A DNA molecule may be expressed intracellularly in yeast. A promoter sequence may be 
directly linked with the DNA molecule, in which case the first amino acid at the N-terminus of 
the recombinant protein will always be a methionine, which is encoded by the ATG start codon. 
If desired, methionine at the N-terminus may be cleaved from the protein by in vitro incubation 
with cyanogen bromide. 

Fusion proteins provide an alternative for yeast expression systems, as well as in mammalian,
baculovirus, and bacterial expression systems. Usually, a DNA sequence encoding the
N-terminal portion of an endogenous yeast protein, or other stable protein, is fused to the 5' end of
heterologous coding sequences. Upon expression, this construct will provide a fusion of the two
amino acid sequences. For example, the yeast or human superoxide dismutase (SOD) gene can
be linked at the 5' terminus of a foreign gene and expressed in yeast. The DNA sequence at the
junction of the two amino acid sequences may or may not encode a cleavable site. See eg.
EP-A-0 196 056. Another example is a ubiquitin fusion protein. Such a fusion protein is made with
the ubiquitin region that preferably retains a site for a processing enzyme (eg. ubiquitin-specific
processing protease) to cleave the ubiquitin from the foreign protein. Through this method,
therefore, native foreign protein can be isolated (eg. WO88/024066).
Alternatively, foreign proteins can also be secreted from the cell into the growth media by
creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence
fragment that provides for secretion in yeast of the foreign protein. Preferably, there are
processing sites encoded between the leader fragment and the foreign gene that can be cleaved
either in vivo or in vitro. The leader sequence fragment usually encodes a signal peptide
comprised of hydrophobic amino acids which direct the secretion of the protein from the cell.
DNA encoding suitable signal sequences can be derived from genes for secreted yeast proteins,
such as the yeast invertase gene (EP-A-0 012 873; JPO. 62,096,086) and the A-factor gene (US
patent 4,588,684). Alternatively, leaders of non-yeast origin, such as an interferon leader, exist
that also provide for secretion in yeast (EP-A-0 060 057).

A preferred class of secretion leaders are those that employ a fragment of the yeast alpha-factor 
gene, which contains both a "pre" signal sequence, and a "pro" region. The types of alpha-factor 
fragments that can be employed include the full-length pre-pro alpha factor leader (about 83 
amino acid residues) as well as truncated alpha-factor leaders (usually about 25 to about 50 
amino acid residues) (US Patents 4,546,083 and 4,870,008; EP-A-0 324 274). Additional 
leaders employing an alpha-factor leader fragment that provides for secretion include hybrid 
alpha-factor leaders made with a presequence of a first yeast, but a pro-region from a second
yeast alpha-factor (eg. see WO 89/02463).

Usually, transcription termination sequences recognized by yeast are regulatory regions located 
3' to the translation stop codon, and thus together with the promoter flank the coding sequence. 
These sequences direct the transcription of an mRNA which can be translated into the 
polypeptide encoded by the DNA. Examples of transcription terminator sequences include
yeast-recognized termination sequences such as those of genes coding for glycolytic enzymes.
Usually, the above described components, comprising a promoter, leader (if desired), coding
sequence of interest, and transcription termination sequence, are put together into expression
constructs. Expression constructs are often maintained in a replicon, such as an
extrachromosomal element (eg. plasmids) capable of stable maintenance in a host, such as yeast
or bacteria. The replicon may have two replication systems, thus allowing it to be maintained,
for example, in yeast for expression and in a prokaryotic host for cloning and amplification.
Examples of such yeast-bacteria shuttle vectors include YEp24 [Botstein et al. (1979) Gene
8:17-24], pC1/1 [Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646], and YRp17
[Stinchcomb et al. (1982) J. Mol. Biol. 158:157]. In addition, a replicon may be either a high or
low copy number plasmid. A high copy number plasmid will generally have a copy number
ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high
copy number plasmid will preferably have at least about 10, and more preferably at least about
20. Either a high or low copy number vector may be selected, depending upon the effect of the
vector and the foreign protein on the host. See eg. Brake et al., supra.
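
By way of illustration only, the 5' to 3' arrangement of the components of an expression construct, and the approximate copy number ranges stated for a high copy number plasmid, can be sketched in Python. The class name, the function name and the short placeholder sequences are hypothetical and are not taken from any cited reference; the numeric bounds are simply the approximate values given in the preceding paragraph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpressionConstruct:
    """Hypothetical container for the parts of a yeast expression construct."""
    promoter: str
    coding_sequence: str
    terminator: str
    leader: Optional[str] = None  # secretion leader, included only if desired

    def assemble(self) -> str:
        """Join the parts 5' to 3': promoter, leader (if any), coding sequence, terminator."""
        parts = [self.promoter, self.leader or "", self.coding_sequence, self.terminator]
        return "".join(parts).upper()


def copy_number_band(copies: int) -> str:
    """Place a plasmid copy number against the approximate ranges stated in the text."""
    if 10 <= copies <= 150:
        return "high copy number (usual range)"
    if 5 <= copies <= 200:
        return "high copy number (general range)"
    return "outside the stated high copy number range"


# Placeholder sequences, far shorter than any real promoter or terminator.
construct = ExpressionConstruct(promoter="tataaa", coding_sequence="atggcctaa",
                                terminator="tttt")
print(construct.assemble())
print(copy_number_band(20))
```

The leader is modelled as optional because the text describes it as included only "if desired"; a real construct would of course carry additional elements such as the replicon and selectable marker.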
Alternatively, the expression constructs can be integrated into the yeast genome with an
integrating vector. Integrating vectors usually contain at least one sequence homologous to a
yeast chromosome that allows the vector to integrate, and preferably contain two homologous
sequences flanking the expression construct. Integrations appear to result from recombinations
between homologous DNA in the vector and the yeast chromosome [Orr-Weaver et al. (1983)
Methods in Enzymol. 101:228-245]. An integrating vector may be directed to a specific locus in
yeast by selecting the appropriate homologous sequence for inclusion in the vector. See
Orr-Weaver et al., supra. One or more expression constructs may integrate, possibly affecting levels
of recombinant protein produced [Rine et al. (1983) Proc. Natl. Acad. Sci. USA 80:6750]. The
chromosomal sequences included in the vector can occur either as a single segment in the
vector, which results in the integration of the entire vector, or two segments homologous to
adjacent segments in the chromosome and flanking the expression construct in the vector,
which can result in the stable integration of only the expression construct.
Usually, extrachromosomal and integrating expression constructs may contain selectable 
markers to allow for the selection of yeast strains that have been transformed. Selectable 
markers may include biosynthetic genes that can be expressed in the yeast host, such as ADE2, 
HIS4, LEU2, TRP1, and ALG7, and the G418 resistance gene, which confer resistance in yeast 
cells to tunicamycin and G418, respectively. In addition, a suitable selectable marker may also 
provide yeast with the ability to grow in the presence of toxic compounds, such as metal. For 
example, the presence of CUP1 allows yeast to grow in the presence of copper ions [Butt et al 
(1987) Microbiol. Rev. 51:351].

Alternatively, some of the above described components can be put together into transformation 
vectors. Transformation vectors are usually comprised of a selectable marker that is either 
maintained in a replicon or developed into an integrating vector, as described above.
Expression and transformation vectors, either extrachromosomal replicons or integrating
vectors, have been developed for transformation into many yeasts. For example, expression
vectors have been developed for, inter alia, the following yeasts: Candida albicans [Kurtz et al.
(1986) Mol. Cell. Biol. 6:142], Candida maltosa [Kunze et al. (1985) J. Basic Microbiol.
25:141], Hansenula polymorpha [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459;
Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302], Kluyveromyces fragilis [Das et al.
(1984) J. Bacteriol. 158:1165], Kluyveromyces lactis [De Louvencourt et al. (1983) J.
Bacteriol. 154:731; Van den Berg et al. (1990) Bio/Technology 8:135], Pichia guilliermondii
[Kunze et al. (1985) J. Basic Microbiol. 25:141], Pichia pastoris [Cregg et al. (1985) Mol. Cell.
Biol. 5:3376; US Patent Nos. 4,837,148 and 4,929,555], Saccharomyces cerevisiae [Hinnen et
al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163],
Schizosaccharomyces pombe [Beach and Nurse (1981) Nature 300:706], and Yarrowia
lipolytica [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet.
10:49].

Methods of introducing exogenous DNA into yeast hosts are well-known in the art, and usually 
include either the transformation of spheroplasts or of intact yeast cells treated with alkali 
cations. Transformation procedures usually vary with the yeast species to be transformed. See
eg. [Kurtz et al. (1986) Mol. Cell. Biol. 6:142; Kunze et al. (1985) J. Basic Microbiol. 25:141;
Candida]; [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol.
Gen. Genet. 202:302; Hansenula]; [Das et al. (1984) J. Bacteriol. 158:1165; De Louvencourt et
al. (1983) J. Bacteriol. 154:1165; Van den Berg et al. (1990) Bio/Technology 8:135;
Kluyveromyces]; [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J. Basic
Microbiol. 25:141; US Patent Nos. 4,837,148 and 4,929,555; Pichia]; [Hinnen et al. (1978)
Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163; Saccharomyces];
[Beach and Nurse (1981) Nature 300:706; Schizosaccharomyces]; [Davidow et al. (1985) Curr.
Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49; Yarrowia].

Antibodies

As used herein, the term "antibody" refers to a polypeptide or group of polypeptides composed 
of at least one antibody combining site. An "antibody combining site" is the three-dimensional 
binding space with an internal surface shape and charge distribution complementary to the 
features of an epitope of an antigen, which allows a binding of the antibody with the antigen. 
"Antibody" includes, for example, vertebrate antibodies, hybrid antibodies, chimeric antibodies, 
humanised antibodies, altered antibodies, univalent antibodies, Fab proteins, and single domain 
antibodies. 

Antibodies against the proteins of the invention are useful for affinity chromatography, 
immunoassays, and distinguishing/identifying Streptococcal proteins. 
Antibodies to the proteins of the invention, both polyclonal and monoclonal, may be prepared 
by conventional methods. In general, the protein is first used to immunize a suitable animal, 
preferably a mouse, rat, rabbit or goat. Rabbits and goats are preferred for the preparation of
polyclonal sera due to the volume of serum obtainable, and the availability of labeled anti-rabbit
and anti-goat antibodies. Immunization is generally performed by mixing or emulsifying the
protein in saline, preferably in an adjuvant such as Freund's complete adjuvant, and injecting
the mixture or emulsion parenterally (generally subcutaneously or intramuscularly). A dose of
50-200 µg/injection is typically sufficient. Immunization is generally boosted 2-6 weeks later
with one or more injections of the protein in saline, preferably using Freund's incomplete
adjuvant. One may alternatively generate antibodies by in vitro immunization using methods
known in the art, which for the purposes of this invention is considered equivalent to in vivo
immunization. Polyclonal antiserum is obtained by bleeding the immunized animal into a glass or
plastic container, incubating the blood at 25°C for one hour, followed by incubating at 4°C for
2-18 hours. The serum is recovered by centrifugation (eg. 1,000 x g for 10 minutes). About 20-50
ml per bleed may be obtained from rabbits. 

Monoclonal antibodies are prepared using the standard method of Kohler & Milstein [Nature 
(1975) 256:495-96], or a modification thereof. Typically, a mouse or rat is immunized as 
described above. However, rather than bleeding the animal to extract serum, the spleen (and 
optionally several large lymph nodes) is removed and dissociated into single cells. If desired,
the spleen cells may be screened (after removal of nonspecifically adherent cells) by applying a
cell suspension to a plate or well coated with the protein antigen. B-cells expressing
membrane-bound immunoglobulin specific for the antigen bind to the plate, and are not rinsed
away with the rest of the suspension. Resulting B-cells, or all dissociated spleen cells, are then
induced to fuse with myeloma cells to form hybridomas, and are cultured in a selective medium
(eg. hypoxanthine, aminopterin, thymidine medium, "HAT"). The resulting hybridomas are
plated by limiting dilution, and are assayed for production of antibodies which bind specifically 
to the immunizing antigen (and which do not bind to unrelated antigens). The selected 
MAb-secreting hybridomas are then cultured either in vitro (eg. in tissue culture bottles or 
hollow fiber reactors), or in vivo (as ascites in mice). 

If desired, the antibodies (whether polyclonal or monoclonal) may be labeled using 
conventional techniques. Suitable labels include fluorophores, chromophores, radioactive atoms 
(particularly 32P and 125I), electron-dense reagents, enzymes, and ligands having specific
binding partners. Enzymes are typically detected by their activity. For example, horseradish
peroxidase is usually detected by its ability to convert 3,3',5,5'-tetramethylbenzidine (TMB) to a
blue pigment, quantifiable with a spectrophotometer. "Specific binding partner" refers to a
protein capable of binding a ligand molecule with high specificity, as for example in the case of
an antigen and a monoclonal antibody specific therefor. Other specific binding partners include
biotin and avidin or streptavidin, IgG and protein A, and the numerous receptor-ligand couples
known in the art. It should be understood that the above description is not meant to categorize
the various labels into distinct classes, as the same label may serve in several different modes.
For example, 125I may serve as a radioactive label or as an electron-dense reagent. HRP may
serve as enzyme or as antigen for a MAb. Further, one may combine various labels for desired
effect. For example, MAbs and avidin also require labels in the practice of this invention: thus,
one might label a MAb with biotin, and detect its presence with avidin labeled with 125I, or with
an anti-biotin MAb labeled with HRP. Other permutations and possibilities will be readily
apparent to those of ordinary skill in the art, and are considered as equivalents within the scope 
of the instant invention.

Pharmaceutical Compositions

Pharmaceutical compositions can comprise either polypeptides, antibodies, or nucleic acid of 
the invention. The pharmaceutical compositions will comprise a therapeutically effective 
amount of either polypeptides, antibodies, or polynucleotides of the claimed invention. 
The term "therapeutically effective amount" as used herein refers to an amount of a therapeutic 
agent to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a detectable 
therapeutic or preventative effect. The effect can be detected by, for example, chemical markers
or antigen levels. Therapeutic effects also include reduction in physical symptoms, such as
decreased body temperature. The precise effective amount for a subject will depend upon the
subject's size and health, the nature and extent of the condition, and the therapeutics or
combination of therapeutics selected for administration. Thus, it is not useful to specify an exact
effective amount in advance. However, the effective amount for a given situation can be
determined by routine experimentation and is within the judgement of the clinician.
For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 50
mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is
administered. 
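
The per-kilogram ranges stated above translate into absolute amounts by simple multiplication with body weight. The following Python sketch shows only that arithmetic; the function name and the 70 kg example weight are assumptions made for illustration, and the bounds are the approximate figures recited in the preceding paragraph.

```python
def dose_range_mg(weight_kg: float, low_mg_per_kg: float, high_mg_per_kg: float):
    """Return (low, high) absolute amounts in mg for a subject of the given weight."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return (low_mg_per_kg * weight_kg, high_mg_per_kg * weight_kg)


# The two ranges recited in the text, applied to a hypothetical 70 kg subject.
broad = dose_range_mg(70.0, 0.01, 50.0)
narrow = dose_range_mg(70.0, 0.05, 10.0)
print("broad range:  %.2f to %.2f mg" % broad)
print("narrow range: %.2f to %.2f mg" % narrow)
```

As the text notes, the precise effective amount depends on the subject and the condition, so values of this kind mark only the outer limits within which routine experimentation would proceed.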

A pharmaceutical composition can also contain a pharmaceutically acceptable carrier. The term 
"pharmaceutically acceptable carrier" refers to a carrier for administration of a therapeutic 
agent, such as antibodies or a polypeptide, genes, and other therapeutic agents. The term refers 
to any pharmaceutical carrier that does not itself induce the production of antibodies harmful to 
the individual receiving the composition, and which may be administered without undue 
toxicity. Suitable carriers may be large, slowly metabolized macromolecules such as proteins, 
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid 
copolymers, and inactive virus particles. Such carriers are well known to those of ordinary skill 
in the art.

Pharmaceutically acceptable salts can be used therein, for example, mineral acid salts such as 
hydrochlorides, hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids 
such as acetates, propionates, malonates, benzoates, and the like. A thorough discussion of
pharmaceutically acceptable excipients is available in Remington's Pharmaceutical Sciences
(Mack Pub. Co., N.J. 1991).

Pharmaceutically acceptable carriers in therapeutic compositions may contain liquids such as 
water, saline, glycerol and ethanol. Additionally, auxiliary substances, such as wetting or 
emulsifying agents, pH buffering substances, and the like, may be present in such vehicles. 
Typically, the therapeutic compositions are prepared as injectables, either as liquid solutions or 
suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to 
injection may also be prepared. Liposomes are included within the definition of a 
pharmaceutically acceptable carrier. 

Delivery Methods 

Once formulated, the compositions of the invention can be administered directly to the subject.
The subjects to be treated can be animals; in particular, human subjects can be treated.
Direct delivery of the compositions will generally be accomplished by injection, either 
subcutaneously, intraperitoneally, intravenously or intramuscularly or delivered to the 
interstitial space of a tissue. The compositions can also be administered into a lesion. Other 
modes of administration include oral and pulmonary administration, suppositories, and 
transdermal or transcutaneous applications (eg. see WO98/20734), needles, and gene guns or 
hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule. 
See also Delivery Strategies for Antisense Oligonucleotide Therapeutics (ed. Akhtar) ISBN 
0849347785. 

Vaccines 

Vaccines according to the invention may either be prophylactic (ie. to prevent infection) or 
therapeutic (ie. to treat disease after infection). 

Such vaccines comprise immunising antigen(s), immunogen(s), polypeptide(s), protein(s) or
nucleic acid, usually in combination with "pharmaceutically acceptable carriers," which include 
any carrier that does not itself induce the production of antibodies harmful to the individual 
receiving the composition. Suitable carriers are typically large, slowly metabolized
macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids,
polymeric amino acids, amino acid copolymers, lipid aggregates (such as oil droplets or liposomes),
and inactive virus particles. Such carriers are well known to those of ordinary skill in the art. 
Additionally, these carriers may function as immunostimulating agents ("adjuvants"). 
Furthermore, the antigen or immunogen may be conjugated to a bacterial toxoid, such as a 
toxoid from diphtheria, tetanus, cholera, H. pylori, etc. pathogens. 

Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: 
(1) aluminum salts (alum), such as aluminum hydroxide, aluminum phosphate, aluminum 
sulfate, etc; (2) oil-in-water emulsion formulations (with or without other specific 
immunostimulating agents such as muramyl peptides (see below) or bacterial cell wall 
components), such as for example (a) MF59™ (WO 90/14837; Chapter 10 in Vaccine design: 
the subunit and adjuvant approach, eds. Powell & Newman, Plenum Press 1995), containing 
5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of 
MTP-PE (see below), although not required) formulated into submicron particles using a 
microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton, MA), (b) SAF,
containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP 
(see below) either microfluidized into a submicron emulsion or vortexed to generate a larger 
particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, 
MT) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components
from the group consisting of monophosphorylipid A (MPL), trehalose dimycolate (TDM), and 
cell wall skeleton (CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants, such as 
Stimulon™ (Cambridge Bioscience, Worcester, MA) may be used or particles generated 
therefrom such as ISCOMs (immunostimulating complexes); (4) Complete Freund's Adjuvant 
(CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as interleukins (eg. IL-1,
IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (eg. gamma interferon), macrophage
colony stimulating factor (M-CSF), tumor necrosis factor (TNF); etc; and (6) other substances 
that act as immunostimulating agents to enhance the effectiveness of the composition. Alum 
and MF59™ are preferred.
As mentioned above, muramyl peptides include, but are not limited to,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

The immunogenic compositions (eg. the immunising antigen/immunogen/polypeptide/protein/ 
nucleic acid, pharmaceutically acceptable carrier, and adjuvant) typically will contain diluents,
such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting 
or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles. 
Typically, the immunogenic compositions are prepared as injectables, either as liquid solutions 
or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to 
injection may also be prepared. The preparation also may be emulsified or encapsulated in 
liposomes for enhanced adjuvant effect, as discussed above under pharmaceutically acceptable 
carriers. 

Immunogenic compositions used as vaccines comprise an immunologically effective amount of 
the antigenic or immunogenic polypeptides, as well as any other of the above-mentioned 
components, as needed. By "immunologically effective amount", it is meant that the
administration of that amount to an individual, either in a single dose or as part of a series, is effective
for treatment or prevention. This amount varies depending upon the health and physical 
condition of the individual to be treated, the taxonomic group of individual to be treated (eg. 
nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize 
antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's 
assessment of the medical situation, and other relevant factors. It is expected that the amount 
will fall in a relatively broad range that can be determined through routine trials. 
The immunogenic compositions are conventionally administered parenterally, eg. by injection, 
either subcutaneously, intramuscularly, or transdermally/transcutaneously (eg. WO98/20734). 
Additional formulations suitable for other modes of administration include oral and pulmonary 
formulations, suppositories, and transdermal applications. Dosage treatment may be a single 
dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction 
with other immunoregulatory agents.
As an alternative to protein-based vaccines, DNA vaccination may be used [eg. Robinson &
Torres (1997) Seminars in Immunol. 9:271-283; Donnelly et al. (1997) Annu. Rev. Immunol.
15:617-648; later herein].

Gene Delivery Vehicles 

Gene therapy vehicles for delivery of constructs including a coding sequence of a therapeutic of 
the invention, to be delivered to the mammal for expression in the mammal, can be 
administered either locally or systemically. These constructs can utilize viral or non-viral vector 
approaches in in vivo or ex vivo modality. Expression of such coding sequence can be induced 
using endogenous mammalian or heterologous promoters. Expression of the coding sequence in 
vivo can be either constitutive or regulated. 

The invention includes gene delivery vehicles capable of expressing the contemplated nucleic 
acid sequences. The gene delivery vehicle is preferably a viral vector and, more preferably, a 
retroviral, adenoviral, adeno-associated viral (AAV), herpes viral, or alphavirus vector. The 
viral vector can also be an astrovirus, coronavirus, orthomyxovirus, papovavirus,
paramyxovirus, parvovirus, picornavirus, poxvirus, or togavirus viral vector. See generally,
Jolly (1994) Cancer Gene Therapy 1:51-64; Kimura (1994) Human Gene Therapy 5:845-852; 
Connelly (1995) Human Gene Therapy 6:185-193; and Kaplitt (1994) Nature Genetics 
6:148-153. 

Retroviral vectors are well known in the art and we contemplate that any retroviral gene therapy 
vector is employable in the invention, including B, C and D type retroviruses, xenotropic 
retroviruses (for example, NZB-X1, NZB-X2 and NZB9-1 (see O'Neill (1985) J. Virol. 53:160)),
polytropic retroviruses eg. MCF and MCF-MLV (see Kelly (1983) J. Virol. 45:291),
spumaviruses and lentiviruses. See RNA Tumor Viruses, Second Edition, Cold Spring Harbor 
Laboratory, 1985. 

Portions of the retroviral gene therapy vector may be derived from different retroviruses. For
example, retrovector LTRs may be derived from a Murine Sarcoma Virus, a tRNA binding site
from a Rous Sarcoma Virus, a packaging signal from a Murine Leukemia Virus, and an origin
of second strand synthesis from an Avian Leukosis Virus.
These recombinant retroviral vectors may be used to generate transduction competent retroviral
vector particles by introducing them into appropriate packaging cell lines (see US patent 
5,591,624). Retrovirus vectors can be constructed for site-specific integration into host cell 
DNA by incorporation of a chimeric integrase enzyme into the retroviral particle (see 
WO96/37626). It is preferable that the recombinant viral vector is a replication defective
recombinant virus. 

Packaging cell lines suitable for use with the above-described retrovirus vectors are well known 
in the art, are readily prepared (see WO95/30763 and WO92/05266), and can be used to create 
producer cell lines (also termed vector cell lines or "VCLs") for the production of recombinant 
vector particles. Preferably, the packaging cell lines are made from human parent cells (eg.
HT1080 cells) or mink parent cell lines, which eliminates inactivation in human serum.
Preferred retroviruses for the construction of retroviral gene therapy vectors include Avian 
Leukosis Virus, Bovine Leukemia Virus, Murine Leukemia Virus, Mink-Cell Focus-Inducing
Virus, Murine Sarcoma Virus, Reticuloendotheliosis Virus and Rous Sarcoma Virus.
Particularly preferred Murine Leukemia Viruses include 4070A and 1504A (Hartley and Rowe
(1976) J Virol 19:19-25), Abelson (ATCC No. VR-999), Friend (ATCC No. VR-245), Graffi,
Gross (ATCC No. VR-590), Kirsten, Harvey Sarcoma Virus and Rauscher (ATCC No.
VR-998) and Moloney Murine Leukemia Virus (ATCC No. VR-190). Such retroviruses may be 
obtained from depositories or collections such as the American Type Culture Collection 
("ATCC") in Rockville, Maryland or isolated from known sources using commonly available 
techniques. 

Exemplary known retroviral gene therapy vectors employable in this invention include those 
described in patent applications GB2200651, EP0415731, EP0345242, EP0334301, 
WO89/02468, WO89/05349, WO89/09271, WO90/02806, WO90/07936, WO94/03622,
WO93/25698, WO93/25234, WO93/11230, WO93/10218, WO91/02805, WO91/02825,
WO95/07994, US 5,219,740, US 4,405,712, US 4,861,719, US 4,980,289, US 4,777,127, US
5,591,624. See also Vile (1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res
53:962-967; Ram (1993) Cancer Res 53:83-88; Takamiya (1992) J Neurosci Res
33:493-503; Baba (1993) J Neurosurg 79:729-735; Mann (1983) Cell 33:153; Cane (1984)
Proc Natl Acad Sci 81:6349; and Miller (1990) Human Gene Therapy 1.
Human adenoviral gene therapy vectors are also known in the art and employable in this 
invention. See, for example, Berkner (1988) Biotechniques 6:616 and Rosenfeld (1991) Science 
252:431, and WO93/07283, WO93/06223, and WO93/07282. Exemplary known adenoviral 
gene therapy vectors employable in this invention include those described in the above 
referenced documents and in WO94/12649, WO93/03769, WO93/19191, WO94/28938,
WO95/11984, WO95/00655, WO95/27071, WO95/29993, WO95/34671, WO96/05320,
WO94/08026, WO94/11506, WO93/06223, WO94/24299, WO95/14102, WO95/24297,
WO95/02697, WO94/28152, WO94/24299, WO95/09241, WO95/25807, WO95/05835,
WO94/18922 and WO95/09654. Alternatively, administration of DNA linked to killed
adenovirus as described in Curiel (1992) Hum. Gene Ther. 3:147-154 may be employed. The 
gene delivery vehicles of the invention also include adenovirus associated virus (AAV) vectors. 
Leading and preferred examples of such vectors for use in this invention are the AAV-2 based 
vectors disclosed in Srivastava, WO93/09239. Most preferred AAV vectors comprise the two 
AAV inverted terminal repeats in which the native D-sequences are modified by substitution of 
nucleotides, such that at least 5 native nucleotides and up to 18 native nucleotides, preferably at 
least 10 native nucleotides up to 18 native nucleotides, most preferably 10 native nucleotides 
are retained and the remaining nucleotides of the D-sequence are deleted or replaced with 
non-native nucleotides. The native D-sequences of the AAV inverted terminal repeats are
sequences of 20 consecutive nucleotides in each AAV inverted terminal repeat (ie. there is one
sequence at each end) which are not involved in HP formation. The non-native replacement
nucleotide may be any nucleotide other than the nucleotide found in the native D-sequence in
the same position. Other employable exemplary AAV vectors are pWP-19 and pWN-1, both of
which are disclosed in Nahreini (1993) Gene 124:257-262. Another example of such an AAV 
vector is psub201 (see Samulski (1987) J. Virol 61:3096). Another exemplary AAV vector is 
the Double-D ITR vector. Construction of the Double-D ITR vector is disclosed in US Patent 
5,478,745. Still other vectors are those disclosed in Carter US Patent 4,797,368 and Muzyczka
US Patent 5,139,941, Chatterjee US Patent 5,474,935, and Kotin WO94/288157. Yet a further
example of an AAV vector employable in this invention is SSV9AFABTKneo, which contains
the AFP enhancer and albumin promoter and directs expression predominantly in the liver. Its 
structure and construction are disclosed in Su (1996) Human Gene Therapy 7:463-470. 
Additional AAV gene therapy vectors are described in US 5,354,678, US 5,173,414, US
5,139,941, and US 5,252,479.

The gene therapy vectors of the invention also include herpes vectors. Leading and preferred 
examples are herpes simplex virus vectors containing a sequence encoding a thymidine kinase 
polypeptide such as those disclosed in US 5,288,641 and EP0176170 (Roizman). Additional 
exemplary herpes simplex virus vectors include HFEM/ICP6-LacZ disclosed in WO95/04139 
(Wistar Institute), pHSVlac described in Geller (1988) Science 241:1667-1669 and in 
WO90/09441 and WO92/07945, HSV Us3::pgC-lacZ described in Fink (1992) Human Gene
Therapy 3:11-19 and HSV 7134, 2 RH 105 and GAL4 described in EP 0453242 (Breakefield),
and those deposited with the ATCC with accession numbers VR-977 and VR-260. 
Also contemplated are alpha virus gene therapy vectors that can be employed in this invention. 
Preferred alpha virus vectors are Sindbis virus vectors, togaviruses, Semliki Forest virus
(ATCC VR-67; ATCC VR-1247), Middleberg virus (ATCC VR-370), Ross River virus (ATCC
VR-373; ATCC VR-1246), Venezuelan equine encephalitis virus (ATCC VR-923; ATCC
VR-1250; ATCC VR-1249; ATCC VR-532), and those described in US patents 5,091,309,
5,217,879, and WO92/10578. More particularly, those alpha virus vectors described in US
Serial No. 08/405,627, filed March 15, 1995, WO94/21792, WO92/10578, WO95/07994, US
5,091,309 and US 5,217,879 are employable. Such alpha viruses may be obtained from
depositories or collections such as the ATCC in Rockville, Maryland or isolated from known 
sources using commonly available techniques. Preferably, alphavirus vectors with reduced 
cytotoxicity are used (see USSN 08/679640). 

DNA vector systems such as eukaryotic layered expression systems are also useful for 
expressing the nucleic acids of the invention. See WO95/07994 for a detailed description of 
eukaryotic layered expression systems. Preferably, the eukaryotic layered expression systems of 
the invention are derived from alphavirus vectors and most preferably from Sindbis viral 
vectors.
Other viral vectors suitable for use in the present invention include those derived from
poliovirus, for example ATCC VR-58 and those described in Evans, Nature 339 (1989) 385 and
Sabin (1973) J. Biol Standardization 1:115; rhinovirus, for example ATCC VR-1110 and those
described in Arnold (1990) J Cell Biochem L401; pox viruses such as canary pox virus or
vaccinia virus, for example ATCC VR-111 and ATCC VR-2010 and those described in
Fisher-Hoch (1989) Proc Natl Acad Sci 86:317; Flexner (1989) Ann NY Acad Sci 569:86,
Flexner (1990) Vaccine 8:17; in US 4,603,112 and US 4,769,330 and WO89/01973; SV40
virus, for example ATCC VR-305 and those described in Mulligan (1979) Nature 277:108 and 
Madzak (1992) J Gen Virol 73:1533; influenza virus, for example ATCC VR-797 and 
recombinant influenza viruses made employing reverse genetics techniques as described in US 
5,166,057 and in Enami (1990) Proc Natl Acad Sci 87:3802-3805; Enami & Palese (1991) J
Virol 65:2711-2713 and Luytjes (1989) Cell 59:110, (see also McMichael (1983) N Engl J Med
309:13, and Yap (1978) Nature 273:238 and Nature (1979) 277:108); human
immunodeficiency virus as described in EP-0386882 and in Buchschacher (1992) J. Virol.
66:2731; measles virus, for example ATCC VR-67 and VR-1247 and those described in EP- 
0440219; Aura virus, for example ATCC VR-368; Bebaru virus, for example ATCC VR-600 
and ATCC VR-1240; Cabassou virus, for example ATCC VR-922; Chikungunya virus, for 
example ATCC VR-64 and ATCC VR-1241; Fort Morgan Virus, for example ATCC VR-924; 
Getah virus, for example ATCC VR-369 and ATCC VR-1243; Kyzylagach virus, for example 
ATCC VR-927; Mayaro virus, for example ATCC VR-66; Mucambo virus, for example ATCC 
VR-580 and ATCC VR-1244; Ndumu virus, for example ATCC VR-371; Pixuna virus, for 
example ATCC VR-372 and ATCC VR-1245; Tonate virus, for example ATCC VR-925; 
Triniti virus, for example ATCC VR-469; Una virus, for example ATCC VR-374; Whataroa 
virus, for example ATCC VR-926; Y-62-33 virus, for example ATCC VR-375; O'Nyong virus, 
Eastern encephalitis virus, for example ATCC VR-65 and ATCC VR-1242; Western 
encephalitis virus, for example ATCC VR-70, ATCC VR-1251, ATCC VR-622 and ATCC 
VR-1252; and coronavirus, for example ATCC VR-740 and those described in Hamre (1966) 
Proc Soc Exp Biol Med 121:190.
Delivery of the compositions of this invention into cells is not limited to the above mentioned
viral vectors. Other delivery methods and media may be employed such as, for example, nucleic
acid expression vectors, polycationic condensed DNA linked or unlinked to killed adenovirus
alone, for example see US Serial No. 08/366,787, filed December 30, 1994 and Curiel (1992)
Hum Gene Ther 3:147-154, ligand linked DNA, for example see Wu (1989) J Biol Chem
264:16985-16987, eucaryotic cell delivery vehicles cells, for example see US Serial
No. 08/240,030, filed May 9, 1994, and US Serial No. 08/404,796, deposition of
photopolymerized hydrogel materials, hand-held gene transfer particle gun, as described in US
Patent 5,149,655, ionizing radiation as described in US 5,206,152 and in WO92/11033, nucleic
charge neutralization or fusion with cell membranes. Additional approaches are described in
Philip (1994) Mol Cell Biol 14:2411-2418 and in Woffendin (1994) Proc Natl Acad Sci
91:11581-11585.

Particle mediated gene transfer may be employed, for example see US Serial No. 60/023,867. 
Briefly, the sequence can be inserted into conventional vectors that contain conventional control 
sequences for high level expression, and then incubated with synthetic gene transfer molecules 
such as polymeric DNA-binding cations like polylysine, protamine, and albumin, linked to cell 
targeting ligands such as asialoorosomucoid, as described in Wu & Wu (1987) J Biol Chem
262:4429-4432, insulin as described in Hucked (1990) Biochem Pharmacol 40:253-263,
galactose as described in Plank (1992) Bioconjugate Chem 3:533-539, lactose or transferrin.
Naked DNA may also be employed. Exemplary naked DNA introduction methods are described
in WO 90/11092 and US 5,580,859. Uptake efficiency may be improved using biodegradable
latex beads. DNA coated latex beads are efficiently transported into cells after endocytosis 
initiation by the beads. The method may be improved further by treatment of the beads to 
increase hydrophobicity and thereby facilitate disruption of the endosome and release of the 
DNA into the cytoplasm. 

Liposomes that can act as gene delivery vehicles are described in US 5,422,120, WO95/13796,
WO94/23697, WO91/14445 and EP-524,968. As described in USSN 60/023,867, on non-viral
delivery, the nucleic acid sequences encoding a polypeptide can be inserted into conventional
vectors that contain conventional control sequences for high level expression, and then be
incubated with synthetic gene transfer molecules such as polymeric DNA-binding cations like
polylysine, protamine, and albumin, linked to cell targeting ligands such as asialoorosomucoid,
insulin, galactose, lactose, or transferrin. Other delivery systems include the use of liposomes to
encapsulate DNA comprising the gene under the control of a variety of tissue-specific or
ubiquitously-active promoters. Further non-viral delivery suitable for use includes mechanical
delivery systems such as the approach described in Woffendin et al (1994) Proc. Natl Acad.
Sci USA 91(24):11581-11585. Moreover, the coding sequence and the product of expression of
such can be delivered through deposition of photopolymerized hydrogel materials. Other
conventional methods for gene delivery that can be used for delivery of the coding sequence
include, for example, use of hand-held gene transfer particle gun, as described in US 5,149,655;
use of ionizing radiation for activating transferred gene, as described in US 5,206,152 and
WO92/11033.

Exemplary liposome and polycationic gene delivery vehicles are those described in US 
5,422,120 and 4,762,915; in WO 95/13796; WO94/23697; and WO91/14445; in EP-0524968;
and in Stryer, Biochemistry, pages 236-240 (1975) W.H. Freeman, San Francisco; Szoka
(1980) Biochem Biophys Acta 600:1; Bayer (1979) Biochem Biophys Acta 550:464; Rivnay
(1987) Meth Enzymol 149:119; Wang (1987) Proc Natl Acad Sci 84:7851; Plant (1989) Anal
Biochem 176:420.

A polynucleotide composition can comprise a therapeutically effective amount of a gene therapy
vehicle, as the term is defined above. For purposes of the present invention, an effective dose
will be from about 0.01 mg/kg to 50 mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA
constructs in the individual to which it is administered.
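
By way of illustration only, the per-kilogram dose ranges stated above can be converted into a total amount of DNA construct for a subject of a given body weight. The following Python sketch is not part of the disclosed method; the function name dose_range_mg and the 70 kg example weight are assumptions introduced here, and only the mg/kg limits are taken from the text.

```python
def dose_range_mg(body_weight_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (lowest, highest) total dose in mg of DNA construct.

    Defaults are the broader range given in the text (0.01 to 50 mg/kg);
    pass 0.05 and 10.0 for the narrower range.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower limit exceeds upper limit")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Hypothetical 70 kg subject, broader and narrower ranges.
broad = dose_range_mg(70)
narrow = dose_range_mg(70, 0.05, 10.0)
print("broad range (mg):", broad)
print("narrow range (mg):", narrow)
```

The calculation is a plain multiplication; it does not address formulation, route of administration or dosing schedule.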

Delivery Methods 

Once formulated, the polynucleotide compositions of the invention can be administered (1) 
directly to the subject; (2) delivered ex vivo, to cells derived from the subject; or (3) in vitro for 
expression of recombinant proteins. The subjects to be treated can be mammals or birds. Also, 
human subjects can be treated.
Direct delivery of the compositions will generally be accomplished by injection, either
subcutaneously, intraperitoneally, intravenously or intramuscularly or delivered to the
interstitial space of a tissue. The compositions can also be administered into a lesion. Other
modes of administration include oral and pulmonary administration, suppositories, and
transdermal or transcutaneous applications (eg. see WO98/20734), needles, and gene guns or
hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule.
Methods for the ex vivo delivery and reimplantation of transformed cells into a subject are
known in the art and described in eg. WO93/14778. Examples of cells useful in ex vivo
applications include, for example, stem cells, particularly hematopoietic, lymph cells,
macrophages, dendritic cells, or tumor cells.

Generally, delivery of nucleic acids for both ex vivo and in vitro applications can be 
accomplished by the following procedures, for example, dextran-mediated transfection, calcium 
phosphate precipitation, polybrene mediated transfection, protoplast fusion, electroporation, 
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into 
nuclei, all well known in the art.

Polynucleotide and polypeptide pharmaceutical compositions

The terms "polynucleotide" and "nucleic acid", used interchangeably herein, 

In addition to the pharmaceutically acceptable carriers and salts described above, the following
additional agents can be used with polynucleotide and/or polypeptide compositions.

A. Polypeptides

One example is polypeptides, which include, without limitation: asialoorosomucoid (ASOR);
transferrin; asialoglycoproteins; antibodies; antibody fragments; ferritin; interleukins;
interferons; granulocyte-macrophage colony stimulating factor (GM-CSF); granulocyte colony
stimulating factor (G-CSF); macrophage colony stimulating factor (M-CSF); stem cell factor;
and erythropoietin. Viral antigens, such as envelope proteins, can also be used. Also, proteins
from other invasive organisms can be used, such as the 17 amino acid peptide from the
circumsporozoite protein of Plasmodium falciparum known as RII.

B. Hormones, Vitamins, etc.

Other groups that can be included are, for example: hormones, steroids, androgens, estrogens,
thyroid hormone, or vitamins, folic acid.

C. Polyalkylenes, Polysaccharides, etc.

Also, polyalkylene glycol can be included with the desired polynucleotides/polypeptides. In a
preferred embodiment, the polyalkylene glycol is polyethylene glycol. In addition, mono-, di-,
or polysaccharides can be included. In a preferred embodiment of this aspect, the
polysaccharide is dextran or DEAE-dextran. Also, chitosan and poly(lactide-co-glycolide) can
be included.

D. Lipids and Liposomes

The desired polynucleotide/polypeptide can also be encapsulated in lipids or packaged in
liposomes prior to delivery to the subject or to cells derived therefrom.
Lipid encapsulation is generally accomplished using liposomes which are able to stably bind or
entrap and retain nucleic acid. The ratio of condensed polynucleotide to lipid preparation can
vary but will generally be around 1:1 (mg DNA:micromoles lipid), or more of lipid. For a
review of the use of liposomes as carriers for delivery of nucleic acids, see, Hug and Sleight
(1991) Biochim. Biophys. Acta. 1097:1-17; Straubinger (1983) Meth. Enzymol. 101:512-527.
Liposomal preparations for use in the present invention include cationic (positively charged),
anionic (negatively charged) and neutral preparations. Cationic liposomes have been shown to
mediate intracellular delivery of plasmid DNA (Felgner (1987) Proc. Natl. Acad. Sci. USA
84:7413-7416); mRNA (Malone (1989) Proc. Natl. Acad. Sci. USA 86:6077-6081); and
purified transcription factors (Debs (1990) J. Biol. Chem. 265:10189-10192), in functional
form.
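
As a non-limiting illustration of the ratio of about 1:1 (mg DNA to micromoles lipid) mentioned above, the amount of lipid for a given mass of condensed polynucleotide can be sketched as follows. The function name lipid_micromoles and its parameter names are assumptions made for this example only; the ratio is left adjustable because the text states that it can vary.

```python
def lipid_micromoles(dna_mg, micromoles_lipid_per_mg_dna=1.0):
    """Micromoles of lipid to combine with dna_mg of condensed polynucleotide.

    The default of 1.0 reflects the approximate 1:1 ratio
    (mg DNA : micromoles lipid); larger values give an excess of lipid.
    """
    if dna_mg < 0:
        raise ValueError("DNA mass cannot be negative")
    if micromoles_lipid_per_mg_dna <= 0:
        raise ValueError("ratio must be positive")
    return dna_mg * micromoles_lipid_per_mg_dna


# Hypothetical preparation with 2 mg of DNA.
print(lipid_micromoles(2.0))        # 1:1 ratio
print(lipid_micromoles(2.0, 1.5))   # 50% excess of lipid
```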

Cationic liposomes are readily available. For example,
N-[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA) liposomes are available
under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See, also, Felgner
supra). Other commercially available liposomes include transfectace (DDAB/DOPE) and
DOTAP/DOPE (Boehringer). Other cationic liposomes can be prepared from readily available
materials using techniques well known in the art. See, eg. Szoka (1978) Proc. Natl Acad. Sci
USA 75:4194-4198; WO90/11092 for a description of the synthesis of DOTAP
(1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.

Similarly, anionic and neutral liposomes are readily available, such as from Avanti Polar Lipids 
(Birmingham, AL), or can be easily prepared using readily available materials. Such materials 
include phosphatidyl choline, cholesterol, phosphatidyl ethanolamine, dioleoylphosphatidyl 
choline (DOPC), dioleoylphosphatidyl glycerol (DOPG), dioleoylphosphatidyl ethanolamine
(DOPE), among others. These materials can also be mixed with the DOTMA and DOTAP
starting materials in appropriate ratios. Methods for making liposomes using these materials are
well known in the art.

The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar vesicles
(SUVs), or large unilamellar vesicles (LUVs). The various liposome-nucleic acid complexes are
prepared using methods known in the art. See eg. Straubinger (1983) Meth. Enzymol.
101:512-527; Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198; Papahadjopoulos (1975)
Biochim. Biophys. Acta 394:483; Wilson (1979) Cell 17:77; Deamer & Bangham (1976)
Biochim. Biophys. Acta 443:629; Ostro (1977) Biochem. Biophys. Res. Commun. 76:836; Fraley
(1979) Proc. Natl. Acad. Sci. USA 76:3348; Enoch & Strittmatter (1979) Proc. Natl. Acad. Sci.
USA 76:145; Fraley (1980) J. Biol. Chem. 255:10431; Szoka & Papahadjopoulos (1978)
Proc. Natl. Acad. Sci. USA 75:145; and Schaefer-Ridder (1982) Science 215:166.

E. Lipoproteins

In addition, lipoproteins can be included with the polynucleotide/polypeptide to be delivered.
Examples of lipoproteins to be utilized include: chylomicrons, HDL, IDL, LDL, and VLDL.
Mutants, fragments, or fusions of these proteins can also be used. Also, modifications of
naturally occurring lipoproteins can be used, such as acetylated LDL. These lipoproteins can
target the delivery of polynucleotides to cells expressing lipoprotein receptors. Preferably, if
lipoproteins are included with the polynucleotide to be delivered, no other targeting ligand is
included in the composition.
Naturally occurring lipoproteins comprise a lipid and a protein portion. The protein portions are
known as apoproteins. At the present, apoproteins A, B, C, D, and E have been isolated and
identified. At least two of these contain several proteins, designated by Roman numerals: AI,
AII, AIV; CI, CII, CIII.

A lipoprotein can comprise more than one apoprotein. For example, naturally occurring
chylomicrons comprise A, B, C and E; over time these lipoproteins lose A and acquire C and E.
VLDL comprises A, B, C and E apoproteins; LDL comprises apoprotein B; and HDL comprises
apoproteins A, C, and E.

The amino acid sequences of these apoproteins are known and are described in, for example,
Breslow (1985) Annu Rev. Biochem 54:699; Law (1986) Adv. Exp Med. Biol. 151:162; Chen
(1986) J Biol Chem 261:12918; Kane (1980) Proc Natl Acad Sci USA 77:2465; and Utermann
(1984) Hum Genet 65:232.

Lipoproteins contain a variety of lipids including triglycerides, cholesterol (free and esters),
and phospholipids. The composition of the lipids varies in naturally occurring lipoproteins. For
example, chylomicrons comprise mainly triglycerides. A more detailed description of the lipid
content of naturally occurring lipoproteins can be found, for example, in Meth. Enzymol. 128
(1986). The composition of the lipids is chosen to aid in conformation of the apoprotein for
receptor binding activity. The composition of lipids can also be chosen to facilitate hydrophobic
interaction and association with the polynucleotide binding molecule.

Naturally occurring lipoproteins can be isolated from serum by ultracentrifugation, for instance.
Such methods are described in Meth. Enzymol. (supra); Pitas (1980) J. Biochem.
255:5454-5460 and Mahley (1979) J Clin. Invest 64:743-750. Lipoproteins can also be produced
by in vitro or recombinant methods by expression of the apoprotein genes in a desired host cell.
See, for example, Atkinson (1986) Annu Rev Biophys Chem 15:403 and Radding (1958)
Biochim Biophys Acta 30:443. Lipoproteins can also be purchased from commercial suppliers,
such as Biomedical Technologies, Inc., Stoughton, Massachusetts, USA. Further description of
lipoproteins can be found in Zuckermann et al PCT/US97/14465.

F. Polycationic Agents

Polycationic agents can be included, with or without lipoprotein, in a composition with the 
desired polynucleotide/polypeptide to be delivered. 

Polycationic agents, typically, exhibit a net positive charge at physiologically relevant pH and are
capable of neutralizing the electrical charge of nucleic acids to facilitate delivery to a desired
location. These agents have in vitro, ex vivo, and in vivo applications. Polycationic agents
can be used to deliver nucleic acids to a living subject either intramuscularly, subcutaneously,
etc.

The following are examples of useful polypeptides as polycationic agents: polylysine,
polyarginine, polyornithine, and protamine. Other examples include histones, protamines,
human serum albumin, DNA binding proteins, non-histone chromosomal proteins, and coat
proteins from DNA viruses, such as ΦX174. Transcriptional factors also contain domains that
bind DNA and therefore may be useful as nucleic acid condensing agents. Briefly, transcriptional
factors such as C/CEBP, c-jun, c-fos, AP-1, AP-2, AP-3, CPF, Prot-1, Sp-1, Oct-1, Oct-2, CREP,
and TFIID contain basic domains that bind DNA sequences.

Organic polycationic agents include: spermine, spermidine, and putrescine.

The dimensions and the physical properties of a polycationic agent can be extrapolated from
the list above, to construct other polypeptide polycationic agents or to produce synthetic
polycationic agents.

Synthetic polycationic agents which are useful include, for example, DEAE-dextran and polybrene.
Lipofectin and lipofectAMINE are monomers that form polycationic complexes when
combined with polynucleotides/polypeptides.

Immunodiagnostic Assays

Streptococcus antigens of the invention can be used in immunoassays to detect antibody levels 
(or, conversely, anti-Streptococcus antibodies can be used to detect antigen levels). 
Immunoassays based on well defined, recombinant antigens can be developed to replace
invasive diagnostic methods. Antibodies to Streptococcus proteins within biological samples,
including, for example, blood or serum samples, can be detected. Design of the immunoassays is
subject to a great deal of variation, and a variety of these are known in the art. Protocols for the
immunoassay may be based, for example, upon competition, or direct reaction, or sandwich 
type assays. Protocols may also, for example, use solid supports, or may be by 
immunoprecipitation. Most assays involve the use of labeled antibody or polypeptide; the labels 
may be, for example, fluorescent, chemiluminescent, radioactive, or dye molecules. Assays 
which amplify the signals from the probe are also known; examples of which are assays which 
utilize biotin and avidin, and enzyme-labeled and mediated immunoassays, such as ELISA 
assays. 

Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are 
constructed by packaging the appropriate materials, including the compositions of the 
invention, in suitable containers, along with the remaining reagents and materials (for example, 
suitable buffers, salt solutions, etc.) required for the conduct of the assay, as well as a suitable set
of assay instructions.

Use of Polypeptides to Screen for Peptide Analogs and Antagonists 

Polypeptides encoded by the instant polynucleotides and corresponding full length genes can be 
used to screen peptide libraries to identify binding partners, such as receptors, from within the 
library. Peptide libraries can be synthesized according to methods known in the art (e.g. US
patent 5,010,175; WO91/17823). Agonists or antagonists of the polypeptides of the invention
can be screened using any available method known in the art, such as signal transduction, 
antibody binding, receptor binding, mitogenic assays, chemotaxis assays, etc. The assay 
conditions ideally should resemble the conditions under which the native activity is exhibited in 
vivo, that is, under physiologic pH, temperature, and ionic strength. Suitable agonists or 
antagonists will exhibit strong inhibition or enhancement of the native activity at concentrations 
that do not cause toxic side effects in the subject. Agonists or antagonists that compete for
binding to the native polypeptide can require concentrations equal to or greater than the native 
concentration, while inhibitors capable of binding irreversibly to the polypeptide can be added 
in concentrations on the order of the native concentration.
Such screening and experimentation can lead to identification of a polypeptide binding partner,
such as a receptor, encoded by a gene or a cDNA corresponding to a polynucleotide described 
herein, and at least one peptide agonist or antagonist of the binding partner. Such agonists and 
antagonists can be used to modulate, enhance, or inhibit receptor function in cells to which the 
receptor is native, or in cells that possess the receptor as a result of genetic engineering. Further, 
if the receptor shares biologically important characteristics with a known receptor, information 
about agonist/antagonist binding can facilitate development of improved agonists/antagonists of 
the known receptor. 

Identification of anti-bacterial agents
Drug Screening Assays 

Of particular interest in the present invention is the identification of agents that have activity in 
modulating expression of one or more of the adhesion-specific genes described herein, so as to 
inhibit infection and/or disease. Of particular interest are screening assays for agents that have a 
low toxicity for human cells. 

The term "agent" as used herein describes any molecule with the capability of altering or 
mimicking the expression or physiological function of a gene product of a differentially 
expressed gene. Generally a plurality of assay mixtures are run in parallel with different agent 
concentrations to obtain a differential response to the various concentrations. Typically, one of
these concentrations serves as a negative control, i.e. at zero concentration or below the level of
detection.
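
Purely as an illustrative sketch, a set of parallel assay mixtures of the kind described above (several agent concentrations plus a zero-concentration negative control) might be laid out as follows. The function name concentration_series, the ten-fold dilution factor and the example top concentration are assumptions made here and are not prescribed by the text.

```python
def concentration_series(top_concentration, n_points, dilution_factor=10.0):
    """Return agent concentrations, highest first, ending with a 0.0 control.

    Each successive mixture is dilution_factor-fold more dilute than the
    previous one; the final entry is the negative control at zero
    concentration. Units are whatever units top_concentration is given in.
    """
    if top_concentration <= 0:
        raise ValueError("top concentration must be positive")
    if n_points < 1:
        raise ValueError("at least one non-zero concentration is required")
    if dilution_factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    series = [top_concentration / dilution_factor ** i for i in range(n_points)]
    return series + [0.0]


# Hypothetical layout: four agent concentrations and one negative control.
for well, conc in enumerate(concentration_series(100.0, 4), start=1):
    print(f"mixture {well}: {conc:g}")
```

A negative control below the level of detection, rather than at exactly zero, could be represented by replacing the final 0.0 with a suitably small value.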

Candidate agents encompass numerous chemical classes, including, but not limited to, organic 
molecules (eg. small organic compounds having a molecular weight of more than 50 and less 
than about 2,500 daltons), peptides, antisense polynucleotides, and ribozymes, and the like. 
Candidate agents can comprise functional groups necessary for structural interaction with 
proteins, particularly hydrogen bonding, and typically include at least an amine, carbonyl, 
hydroxyl or carboxyl group, preferably at least two of the functional chemical groups. The 
candidate agents often comprise cyclical carbon or heterocyclic structures and/or aromatic or 
polyaromatic structures substituted with one or more of the above functional groups. Candidate
agents are also found among biomolecules including, but not limited to: polynucleotides,
peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs 
or combinations thereof. 

Candidate agents are obtained from a wide variety of sources including libraries of synthetic or 
natural compounds. For example, numerous means are available for random and directed 
synthesis of a wide variety of organic compounds and biomolecules, including expression of 
randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural compounds in 
the form of bacterial, fungal, plant and animal extracts are available or readily produced. 
Additionally, natural or synthetically produced libraries and compounds are readily modified 
through conventional chemical, physical and biochemical means, and may be used to produce 
combinatorial libraries. Known pharmacological agents may be subjected to directed or random 
chemical modifications, such as acylation, alkylation, esterification, amidification, etc. to 
produce structural analogs. 

Screening of Candidate Agents In Vitro 

A wide variety of in vitro assays may be used to screen candidate agents for the desired 
biological activity, including, but not limited to, labeled in vitro protein-protein binding assays, 
protein-DNA binding assays (e.g. to identify agents that affect expression), electrophoretic 
mobility shift assays, immunoassays for protein binding, and the like. For example, by 
providing for the production of large amounts of a differentially expressed polypeptide, one can 
identify ligands or substrates that bind to, modulate or mimic the action of the polypeptide. The 
purified polypeptide may also be used for determination of three-dimensional crystal structure, 
which can be used for modeling intermolecular interactions, transcriptional regulation, etc. 
The screening assay can be a binding assay, wherein one or more of the molecules may be
joined to a label, and the label directly or indirectly provides a detectable signal. Various labels
include radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules,
particles, e.g. magnetic particles, and the like. Specific binding molecules include pairs, such as
biotin and streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the
complementary member would normally be labeled with a molecule that provides for detection,
in accordance with known procedures.

A variety of other reagents may be included in the screening assays described herein. Where the 
assay is a binding assay, these include reagents like salts, neutral proteins, e.g. albumin,
detergents, etc. that are used to facilitate optimal protein-protein binding, protein-DNA binding,
and/or reduce non-specific or background interactions. Reagents that improve the efficiency of
the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc. may be
used. The mixture of components is added in any order that provides for the requisite binding.
Incubations are performed at any suitable temperature, typically between 4 and 40°C.
Incubation periods are selected for optimum activity, but may also be optimized to facilitate
rapid high-throughput screening. Typically between 0.1 and 1 hours will be sufficient.
Many mammalian genes have homologs in yeast and lower animals. The study of such
homologs' physiological role and interactions with other proteins in vivo or in vitro can
facilitate understanding of biological function. In addition to model systems based on genetic 
complementation, yeast has been shown to be a powerful tool for studying protein-protein 
interactions through the two hybrid system. 

Nucleic Acid Hybridisation 

"Hybridization" refers to the association of two nucleic acid sequences to one another by 
hydrogen bonding. Typically, one sequence will be fixed to a solid support and the other will be 
free in solution. Then, the two sequences will be placed in contact with one another under 
conditions that favor hydrogen bonding. Factors that affect this bonding include: the type and 
volume of solvent; reaction temperature; time of hybridization; agitation; agents to block the 
non-specific attachment of the liquid phase sequence to the solid support (Denhardt's reagent or
BLOTTO); concentration of the sequences; use of compounds to increase the rate of association 
of sequences (dextran sulfate or polyethylene glycol); and the stringency of the washing 
conditions following hybridization. See Sambrook et al [supra] Volume 2, chapter 9, pages 
9.47 to 9.57.
"Stringency" refers to conditions in a hybridization reaction that favor association of very
similar sequences over sequences that differ. For example, the combination of temperature and
salt concentration should be chosen that is approximately 12° to 20°C below the calculated
Tm of the hybrid under study. The temperature and salt conditions can often be determined
empirically in preliminary experiments in which samples of genomic DNA immobilized on
filters are hybridized to the sequence of interest and then washed under conditions of different
stringencies. See Sambrook et al at page 9.50.
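
For illustration, the relationship between a calculated Tm and the hybridization temperature can be sketched in Python. This sketch assumes the offset below Tm is the 12 to 20°C window given in Sambrook et al; the function name hybridization_window and the example Tm of 70°C are hypothetical, and in practice conditions would still be confirmed empirically as noted above.

```python
def hybridization_window(tm_celsius, min_offset=12.0, max_offset=20.0):
    """Return (lowest, highest) hybridization temperature in degrees C.

    The window runs from max_offset to min_offset degrees below the
    calculated Tm of the hybrid (offsets assumed to be 12 to 20 degrees).
    """
    if min_offset < 0 or max_offset < min_offset:
        raise ValueError("offsets must satisfy 0 <= min_offset <= max_offset")
    return (tm_celsius - max_offset, tm_celsius - min_offset)


# Hypothetical hybrid with a calculated Tm of 70 degrees C.
low, high = hybridization_window(70.0)
print(f"hybridize between {low:g} and {high:g} degrees C")
```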

Variables to consider when performing, for example, a Southern blot are (1) the complexity of
the DNA being blotted and (2) the homology between the probe and the sequences being
detected. The total amount of the fragment(s) to be studied can vary a magnitude of 10, from
0.1 to 1 µg for a plasmid or phage digest to 10⁻⁹ to 10⁻⁸ g for a single copy gene in a highly
complex eukaryotic genome. For lower complexity polynucleotides, substantially shorter
blotting, hybridization, and exposure times, a smaller amount of starting polynucleotides, and
lower specific activity of probes can be used. For example, a single-copy yeast gene can be
detected with an exposure time of only 1 hour starting with 1 µg of yeast DNA, blotting for two
hours, and hybridizing for 4-8 hours with a probe of 10⁸ cpm/µg. For a single-copy mammalian
gene a conservative approach would start with 10 µg of DNA, blot overnight, and hybridize
overnight in the presence of 10% dextran sulfate using a probe of greater than 10⁸ cpm/µg,
resulting in an exposure time of ~24 hours.

Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid between the 
probe and the fragment of interest, and consequently, the appropriate conditions for 
hybridization and washing. In many cases the probe is not 100% homologous to the fragment. 
Other commonly encountered variables include the length and total G+C content of the 
hybridizing sequences and the ionic strength and formamide content of the hybridization buffer. 
The effects of all of these factors can be approximated by a single equation: 
Tm = 81 + 16.6(log10 Ci) + 0.4[%(G + C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch)
where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base
pairs (slightly modified from Meinkoth & Wahl (1984) Anal. Biochem. 138: 267-284).

In designing a hybridization experiment, some factors affecting nucleic acid hybridization can
be conveniently altered. The temperature of the hybridization and washes and the salt
concentration during the washes are the simplest to adjust. As the temperature of the
hybridization increases (i.e. stringency), it becomes less likely for hybridization to occur
between strands that are nonhomologous, and as a result, background decreases. If the
radiolabeled probe is not completely homologous with the immobilized fragment (as is
frequently the case in gene family and interspecies hybridization experiments), the
hybridization temperature must be reduced, and background will increase. The temperature of
the washes affects the intensity of the hybridizing band and the degree of background in a
similar manner. The stringency of the washes is also increased with decreasing salt
concentrations.
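
By way of illustration only, the Tm approximation given earlier in this section may be evaluated numerically. The following Python sketch is not part of the cited references; the function name, parameter names and the example values (1.0 M monovalent salt, 50% G+C, 50% formamide, a 300 bp hybrid) are assumptions chosen for the example.

```python
import math


def melting_temperature(salt_molar, gc_percent, formamide_percent,
                        length_bp, mismatch_percent=0.0):
    """Approximate Tm (degrees C) of a DNA-DNA hybrid.

    Implements Tm = 81 + 16.6(log10 Ci) + 0.4[%(G+C)] - 0.6(%formamide)
                    - 600/n - 1.5(%mismatch)
    where Ci is the monovalent salt concentration (mol/L) and n is the
    hybrid length in base pairs.
    """
    if salt_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.0
            + 16.6 * math.log10(salt_molar)
            + 0.4 * gc_percent
            - 0.6 * formamide_percent
            - 600.0 / length_bp
            - 1.5 * mismatch_percent)


# Hypothetical example values, not taken from the specification.
tm_perfect = melting_temperature(1.0, 50.0, 50.0, 300)
tm_mismatched = melting_temperature(1.0, 50.0, 50.0, 300, mismatch_percent=5.0)
print(f"perfect match: {tm_perfect:.1f} C; 5% mismatch: {tm_mismatched:.1f} C")
```

In this sketch each percent of mismatch lowers the estimate by 1.5°C, which is why a less homologous probe calls for a lower hybridization temperature.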

In general, convenient hybridization temperatures in the presence of 50% formamide are 42°C 
for a probe which is 95% to 100% homologous to the target fragment, 37°C for 90% to 95%
homology, and 32°C for 85% to 90% homology. For lower homologies, formamide content
should be lowered and temperature adjusted accordingly, using the equation above. If the
homology between the probe and the target fragment is not known, the simplest approach is to
start with both hybridization and wash conditions which are nonstringent. If non-specific bands
or high background are observed after autoradiography, the filter can be washed at high 
stringency and reexposed. If the time required for exposure makes this approach impractical, 
several hybridization and/or washing stringencies should be tested in parallel. 
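
The convenient temperatures stated above for hybridization in 50% formamide may be expressed as a simple lookup. The following Python sketch is illustrative only; the function name is hypothetical, and the assignment of the shared boundary values (95% and 90%) to the higher-temperature band is an assumption, since the ranges in the text overlap at those points.

```python
def convenient_hybridization_temperature(homology_percent):
    """Return a convenient hybridization temperature (degrees C) in 50% formamide.

    Returns None below 85% homology, where the text instead advises lowering
    the formamide content and recalculating from the Tm equation.
    """
    if not 0 <= homology_percent <= 100:
        raise ValueError("homology must be between 0 and 100 percent")
    if homology_percent >= 95:   # 95% to 100% homology
        return 42
    if homology_percent >= 90:   # 90% to 95% homology
        return 37
    if homology_percent >= 85:   # 85% to 90% homology
        return 32
    return None                  # no fixed value given for lower homologies


for homology in (98, 92, 86, 70):
    print(homology, convenient_hybridization_temperature(homology))
```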

Nucleic Acid Probe Assays 

Methods such as PCR, branched DNA probe assays, or blotting techniques utilizing nucleic 
acid probes according to the invention can determine the presence of cDNA or mRNA. A probe 
is said to "hybridize" with a sequence of the invention if it can form a duplex or double 
stranded complex, which is stable enough to be detected. 

The nucleic acid probes will hybridize to the Streptococcus nucleotide sequences of the 
invention (including both sense and antisense strands). Though many different nucleotide 
sequences will encode the amino acid sequence, the native Streptococcal sequence is preferred
because it is the actual sequence present in cells. mRNA represents a coding sequence and so a
probe should be complementary to the coding sequence; single-stranded cDNA is
complementary to mRNA, and so a cDNA probe should be complementary to the non-coding
sequence.
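
The strand relationships described above may be illustrated as follows. This Python sketch is an example only; the function names and the six-nucleotide sequence are hypothetical, and sequences are assumed to be unambiguous DNA written 5' to 3'.

```python
# Watson-Crick complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_for_mrna(coding_strand):
    """Probe (5'->3') able to pair with mRNA, which has the sense of the coding strand."""
    return reverse_complement(coding_strand)


def probe_for_first_strand_cdna(coding_strand):
    """Probe (5'->3') able to pair with single-stranded cDNA.

    First-strand cDNA is complementary to the mRNA, so a probe that pairs
    with it has the same sequence as the coding strand.
    """
    return coding_strand.upper()


coding = "ATGGCC"  # hypothetical coding-strand fragment
print(probe_for_mrna(coding))
print(probe_for_first_strand_cdna(coding))
```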

The probe sequence need not be identical to the Streptococcal sequence (or its complement) — 
some variation in the sequence and length can lead to increased assay sensitivity if the nucleic 
acid probe can form a duplex with target nucleotides, which can be detected. Also, the nucleic 
acid probe can include additional nucleotides to stabilize the formed duplex. Additional 
Streptococcus sequence may also be helpful as a label to detect the formed duplex. For 
example, a non-complementary nucleotide sequence may be attached to the 5' end of the probe, 
with the remainder of the probe sequence being complementary to a Streptococcus sequence. 
Alternatively, non-complementary bases or longer sequences can be interspersed into the probe, 
provided that the probe sequence has sufficient complementarity with a Streptococcus
sequence in order to hybridize therewith and thereby form a duplex which can be detected.
The exact length and sequence of the probe will depend on the hybridization conditions (e.g.
temperature, salt condition etc.). For example, for diagnostic applications, depending on the
complexity of the analyte sequence, the nucleic acid probe typically contains at least 10-20
nucleotides, preferably 15-25, and more preferably at least 30 nucleotides, although it may be 
shorter than this. Short primers generally require cooler temperatures to form sufficiently stable 
hybrid complexes with the template. 
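
A probe carrying a non-complementary sequence at its 5' end, as described above, may be sketched as follows. The Python code is illustrative only; the function names, the four-nucleotide tail and the target sequence are hypothetical, and the sketch merely counts the probe positions able to pair with the target when the probe's 3' end is aligned with the target's 5' end.

```python
# Watson-Crick pairing partners for unambiguous DNA bases.
_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(_PAIRS[base] for base in reversed(seq.upper()))


def tailed_probe(target, tail):
    """Build a probe (5'->3'): a non-complementary 5' tail followed by the
    sequence complementary to the target."""
    return tail.upper() + reverse_complement(target)


def complementary_fraction(probe, target):
    """Fraction of probe bases that pair with the target in antiparallel
    alignment, starting from the probe's 3' end and the target's 5' end."""
    probe_3_to_5 = probe.upper()[::-1]
    paired = sum(1 for p, t in zip(probe_3_to_5, target.upper())
                 if _PAIRS[t] == p)
    return paired / len(probe)


target = "ATGGCCTTAGGCATCG"          # hypothetical 16-nucleotide target
probe = tailed_probe(target, "GGGG")  # hypothetical 4-nucleotide 5' tail
print(probe, complementary_fraction(probe, target))
```

Here 16 of the 20 probe positions pair with the target, so the tail leaves the complementary portion intact while remaining available, for example, for detection.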

Probes may be produced by synthetic procedures, such as the triester method of Matteucci et al. 
[J. Am. Chem. Soc. (1981) 103:3185], or according to Urdea et al. [Proc. Natl. Acad. Sci. USA
(1983) 80: 7461], or using commercially available automated oligonucleotide synthesizers.

The chemical nature of the probe can be selected according to preference. For certain
applications, DNA or RNA are appropriate. For other applications, modifications may be
incorporated e.g. backbone modifications, such as phosphorothioates or methylphosphonates,
can be used to increase in vivo half-life, alter RNA affinity, increase nuclease resistance etc.
[e.g. see Agrawal & Iyer (1995) Curr Opin Biotechnol 6:12-19; Agrawal (1996) TIBTECH
14:376-387]; analogues such as peptide nucleic acids may also be used [e.g. see Corey (1997)
TIBTECH 15:224-229; Buchardt et al. (1993) TIBTECH 11:384-386].

Alternatively, the polymerase chain reaction (PCR) is a well-known means for detecting
small amounts of target nucleic acid. The assay is described in Mullis et al. [Meth. Enzymol.
(1987) 155:335-350] & US patents 4,683,195 & 4,683,202. Two "primer" nucleotides hybridize
with the target nucleic acids and are used to prime the reaction. The primers can comprise
sequence that does not hybridize to the sequence of the amplification target (or its complement)
to aid with duplex stability or, for example, to incorporate a convenient restriction site.
Typically, such sequence will flank the desired Streptococcus sequence.

A thermostable polymerase creates copies of target nucleic acids from the primers using the
original target nucleic acids as a template. After a threshold amount of target nucleic acids is
generated by the polymerase, they can be detected by more traditional methods, such as
Southern blots. When using the Southern blot method, the labelled probe will hybridize to the
Streptococcus sequence (or its complement).
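
The use of primers carrying non-hybridizing 5' sequence, such as a restriction site, may be illustrated by predicting the amplification product in silico. The following Python sketch is an example only; the function names, template and primer sequences are hypothetical, exact matching of the primer cores is assumed, and the EcoRI recognition sequence GAATTC is used as the flanking restriction site.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predicted_amplicon(template, fwd_core, rev_core, fwd_tail="", rev_tail=""):
    """Predict the top-strand PCR product (5'->3'), or None if a primer core
    finds no exact site on the template.

    fwd_core / rev_core: template-matching parts of the primers (5'->3').
    fwd_tail / rev_tail: non-hybridizing 5' additions, e.g. a restriction site.
    """
    template = template.upper()
    start = template.find(fwd_core.upper())
    if start == -1:
        return None
    # The reverse primer pairs with the top strand, so its site on the top
    # strand reads as the reverse complement of the primer core.
    rev_site = reverse_complement(rev_core)
    end = template.find(rev_site, start + len(fwd_core))
    if end == -1:
        return None
    inner = template[start:end + len(rev_site)]
    return fwd_tail.upper() + inner + reverse_complement(rev_tail)


# Hypothetical template and primers for illustration.
template = "TTTT" + "ATGGCTAAAG" + "CCCCCCCC" + "GGATCGTTAA" + "AAAA"
product = predicted_amplicon(template,
                             fwd_core="ATGGCTAAAG",
                             rev_core="TTAACGATCC",
                             fwd_tail="GAATTC",   # EcoRI site
                             rev_tail="GAATTC")
print(product)
```

In the predicted product the restriction sites flank the template-derived sequence, consistent with the statement above that such added sequence typically flanks the desired Streptococcus sequence.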

Also, mRNA or cDNA can be detected by traditional blotting techniques described in
Sambrook et al. [supra]. mRNA, or cDNA generated from mRNA using a polymerase enzyme,
can be purified and separated using gel electrophoresis. The nucleic acids on the gel are then
blotted onto a solid support, such as nitrocellulose. The solid support is exposed to a labelled
probe and then washed to remove any unhybridized probe. Next, the duplexes containing the
labelled probe are detected. Typically, the probe is labelled with a radioactive moiety.

ABSTRACT

The invention relates to polynucleotides which are conserved or specific to 
one or more species of Streptococcus, Streptococcus species serotypes, and/or 
serotype isolates. In particular, the invention relates to polynucleotides from 
Streptococcus which are conserved or specific to one or more of the species of S. 
pneumoniae ("pneumococcus" or "S. pn."), S. pyogenes ("group A streptococcus" or
"GAS"), and S. agalactiae ("group B streptococcus" or "GBS"). The invention
further relates to polynucleotides which are conserved or specific to one or more
Streptococcal species serotypes, such as GBS serotypes Ia, Ib, II, III, IV, V, VI, VII
and VIII. The invention still further relates to polynucleotides which are conserved or
specific to one or more clinical isolates of a Streptococcus species.

Table: Complete list of GBS predicted genes

ORF      Size (a.a.)  Annotation
SAG0046  463          membrane protein, putative
SAG0047  432          adenylosuccinate lyase
SAG0048  303          transcriptional regulator, Cro/CI family
SAG0049  332          Holliday junction DNA helicase RuvB
SAG0050  145          phosphotyrosine protein phosphatase, low molecular weight
SAG0051  126          MORN motif family protein
SAG0052  592          membrane protein, putative
SAG0053  880          aldehyde-alcohol dehydrogenase
SAG0054  338          alcohol dehydrogenase, propanol-preferring
SAG0055  496          threonine synthase
SAG0056  412          MATE efflux family protein
SAG0057  102          ribosomal protein S10
SAG0058  208          ribosomal protein L[illegible]
SAG0059  207          ribosomal protein L4
SAG0060  98           ribosomal protein L[illegible]
SAG0061  277          ribosomal protein L[illegible]
SAG0062  92           ribosomal protein S19
SAG0063  114          [illegible]
SAG0064  217          [illegible]
SAG0065  137          ribosomal protein L1[illegible]
SAG0066  68           ribosomal protein L[illegible]
SAG0067  86           ribosomal protein S17
SAG0068  122          ribosomal protein L14
SAG0069  101          ribosomal protein L[illegible]
SAG0070  180          ribosomal protein L5
SAG0071  61           ribosomal protein S14, putative
SAG0072  132          ribosomal protein S8
SAG0073  178          ribosomal protein L6
SAG0074  118          ribosomal protein L18
SAG0075  164          ribosomal protein S5
SAG0076  59           ribosomal protein L30
SAG0077  146          ribosomal protein L15
SAG0078  434          preprotein translocase, SecY subunit
SAG0079  212          adenylate kinase
SAG0080  72           translation initiation factor IF-1
SAG0081  38           ribosomal protein L36
SAG0082  121          ribosomal protein S13
SAG0083  127          ribosomal protein S11
SAG0084  312          DNA-directed RNA polymerase, alpha subunit
SAG0085  128          ribosomal protein L17
SAG0086  85           lipoprotein, putative
SAG0087  59           hypothetical protein
SAG0088  56           hypothetical protein
SAG0089  183          conserved hypothetical protein
SAG0090  139          conserved hypothetical protein
SAG0091  144          transcriptional regulator ComX1, putative
SAG0092  230          phosphoglycerate mutase family protein
SAG0093  250          D-alanyl-D-alanine carboxypeptidase family protein
SAG0094  191          N-acetylmuramoyl-L-alanine amidase, family 4 protein
SAG0095  [illegible]  heat-inducible transcription repressor HrcA
SAG0096  [illegible]  heat shock protein GrpE
SAG0097  [illegible]  dnaK protein
SAG0098  [illegible]  dnaJ protein
SAG0099  [illegible]  transcriptional regulator, GntR family
SAG0100  250          tRNA pseudouridine synthase A
SAG0101  252          phosphomethylpyrimidine kinase, putative
SAG0102  154          conserved hypothetical protein
SAG0103  189          conserved hypothetical protein TIGR01440
SAG0104  280          conserved hypothetical protein
SAG0105  427          trigger factor
SAG0106  191          DNA-directed RNA polymerase, delta subunit, putative
SAG0107  534          CTP synthase
SAG0108  308          conserved hypothetical protein
SAG0109  148          deoxyuridine 5'-triphosphate nucleotidohydrolase
SAG0110  454          DNA repair protein RadA
SAG0111  165          carbonic anhydrase-related protein
SAG0112  439          pyridine nucleotide-disulphide oxidoreductase family protein
SAG0113  484          glutamyl-tRNA synthetase
SAG0114  322          ribose ABC transporter, periplasmic D-ribose-binding protein
SAG0115  310          ribose ABC transporter, permease protein
SAG0116  492          ribose ABC transporter, ATP-binding protein
SAG0117  132          ribose ABC transporter protein RbsD
SAG0118  303          ribokinase
SAG0119  328          ribose operon repressor RbsR
SAG0120  32           hypothetical protein
SAG0121  362          permease, putative
SAG0122  228          ABC transporter, ATP-binding protein
SAG0123  223          DNA-binding response regulator
SAG0124  [illegible]  sensor histidine kinase
SAG0125  [illegible]  argininosuccinate synthase
SAG0126  [illegible]  argininosuccinate lyase
SAG0127  [illegible]  fructose-bisphosphate aldolase
SAG0128  [illegible]  L-2-hydroxyisocaproate dehydrogenase
SAG0129  [illegible]  ribosomal protein L28
SAG0130  121          conserved hypothetical protein
SAG0131  543          DAK2 domain protein
SAG0132  [illegible]  [illegible] domain/[illegible] family protein
SAG0133  [illegible]  conserved hypothetical protein
SAG0134  [illegible]  hypothetical protein
SAG0135  [illegible]  amino acid ABC transporter, ATP-binding protein
SAG0136  516          amino acid ABC transporter, amino acid-binding protein/permease protein
SAG0137  627          conserved hypothetical protein
SAG0138  279          undecaprenol kinase, putative
SAG0139  251          negative regulator of competence MecA, putative
SAG0140  386          glycosyl transferase, group 4 family protein
SAG0141  256          ABC transporter, ATP-binding protein
SAG0142  420          conserved hypothetical protein
SAG0143  410          selenocysteine lyase
SAG0144  147          [illegible] family protein
SAG0145  472          conserved hypothetical protein
SAG0146  395          penicillin-binding protein, putative
SAG0147  411          D-alanyl-D-alanine carboxypeptidase family protein
SAG0148  551          oligopeptide ABC transporter, substrate-binding protein, putative
SAG0149  104          oligopeptide transporter, permease protein
SAG0150  [illegible]  oligopeptide ABC transporter, permease protein
SAG0151  148          oligopeptide ABC transporter, ATP-binding protein
SAG0152  110          oligopeptide ABC transporter, ATP-binding protein
SAG0153  [illegible]  4-diphosphocytidyl-2C-methyl-D-erythritol kinase
SAG0154  147          adc operon repressor AdcR
SAG0155  [illegible]  zinc ABC transporter, ATP-binding protein
SAG0156  [illegible]  zinc ABC transporter, permease protein
SAG0157  NA           deoxyribonuclease-related protein, degenerate
SAG0158  [illegible]  tyrosyl-tRNA synthetase
SAG0159  [illegible]  penicillin-binding protein 1B, putative
SAG0160  [illegible]  DNA-directed RNA polymerase, beta subunit
SAG0161  [illegible]  DNA-directed RNA polymerase, beta subunit
SAG0162  [illegible]  conserved hypothetical protein
SAG0163  [illegible]  competence protein CglA
SAG0164  [illegible]  competence protein CglB
SAG0165  151          conserved hypothetical protein
SAG0166  191          conserved domain protein
SAG0167  194          conserved hypothetical protein
SAG0168  197          acetate kinase
SAG0169  68           transcriptional regulator, Cro/CI family
SAG0170  45           hypothetical protein
SAG0171  151          hypothetical protein
SAG0172  221          [illegible]
SAG0173  256          [illegible]
SAG0174  355          glutamyl-aminopeptidase
SAG0175  79           hypothetical protein
SAG0176  94           conserved hypothetical protein
SAG0177  107          thioredoxin family protein
SAG0178  208          tRNA binding domain protein
SAG0179  238          conserved hypothetical protein
SAG0180  131          single-strand binding protein
SAG0181  214          hydrolase, haloacid dehalogenase-like family
SAG0182  581          sensor histidine kinase, putative
SAG0183  246          response regulator
SAG0184  151          conserved hypothetical protein
SAG0185  242          membrane protein, putative
SAG0186  36           hypothetical protein
SAG0187  542          oligopeptide ABC transporter, oligopeptide-binding protein
SAG0188  325          oligopeptide ABC transporter, permease protein
SAG0189  273          oligopeptide ABC transporter, permease protein
SAG0190  267          peptide ABC transporter, ATP-binding protein
SAG0191  208          peptide ABC transporter, ATP-binding protein
SAG0192  676          PTS system, IIABC components
SAG0193  541          alpha amylase family protein
SAG0194  639          transcriptional antiterminator, BglG family
SAG0195  377          IS1548, transposase
SAG0196  66           conserved domain protein
SAG0197  94           PTS system, IIB component, putative
SAG0198  451          PTS system, IIC component, putative
SAG0199  285          transketolase, N-terminal subunit
SAG0200  309          transketolase, C-terminal subunit
SAG0201  419          oxidoreductase, putative
SAG0202  89           ribosomal protein S15
SAG0203  709          polyribonucleotide nucleotidyltransferase
SAG0204  250          conserved hypothetical protein
SAG0205  194          serine O-acetyltransferase
SAG0206  60           lipoprotein, putative
SAG0207  447          cysteinyl-tRNA synthetase
SAG0208  128          conserved hypothetical protein
SAG0209  251          RNA methyltransferase, TrmH family, group 3
SAG0210  172          conserved hypothetical protein
SAG0211  286          DegV family protein
SAG0212  32           hypothetical protein
SAG0213  39           hypothetical protein
SAG0214  148          ribosomal protein L13
SAG0215  130          ribosomal protein S9
SAG0216  33           hypothetical protein
SAG0217  384          site-specific recombinase, phage integrase family
SAG0218  158          transcriptional regulator, Cro/CI family
SAG0219  101          hypothetical protein
SAG0220  92           conserved hypothetical protein
SAG0221  76           hypothetical protein
SAG0222  108          conserved domain protein
SAG0223  209          conserved hypothetical protein, fusion
SAG0224  332          replication initiation protein, putative
SAG0225  144          hypothetical protein
SAG0226  418          recombination protein
SAG0227  156          hypothetical protein
SAG0228  111          conserved hypothetical protein
SAG0229  93           conserved hypothetical protein
SAG0230  96           conserved hypothetical protein
SAG0231  [illegible]  hypothetical protein
SAG0232  186          hypothetical protein
SAG0233  226          hypothetical protein
SAG0234  128          hypothetical protein
SAG0235  93           hypothetical protein
SAG0236  32           hypothetical protein
SAG0237  34           hypothetical protein
SAG0238  41           hypothetical protein
SAG0239  [illegible]  transcriptional regulator, MutR family
SAG0240  [illegible]  transporter, putative
SAG0241  [illegible]  amino acid ABC transporter, permease protein
SAG0242  [illegible]  amino acid ABC transporter, amino acid-binding protein
SAG0243  [illegible]  amino acid ABC transporter, permease protein
SAG0244  [illegible]  amino acid ABC transporter, ATP-binding protein
SAG0245  159          protein of unknown function/lipoprotein, putative
SAG0246  [illegible]  hypothetical protein
SAG0247  110          hypothetical protein
SAG0248  [illegible]  hypothetical protein
SAG0249  110          hypothetical protein
SAG0250  [illegible]  membrane protein, putative
SAG0251  [illegible]  transcriptional regulator, Cro/CI family
SAG0252  [illegible]  acetyltransferase, GNAT family
SAG0253  [illegible]  acetyltransferase, GNAT family
SAG0254  [illegible]  acetyltransferase, GNAT family
SAG0255  315          conserved hypothetical protein
SAG0256  153          RNA polymerase sigma factor, ECF subfamily
SAG0257  53           lipoprotein, putative
SAG0258  [illegible]  transcriptional regulator, TetR family
SAG0259  [illegible]  ABC transporter efflux protein, DrrB family, putative
SAG0260  [illegible]  ABC transporter, ATP-binding protein
SAG0261  [illegible]  IS1381, transposase OrfB
SAG0262  [illegible]  IS1381, transposase OrfA
SAG0263  [illegible]  hypothetical protein
SAG0264  [illegible]  conserved hypothetical protein
SAG0265  [illegible]  conserved hypothetical protein
SAG0266  [illegible]  N-acetylglucosamine-6-phosphate deacetylase
SAG0267  [illegible]  conserved hypothetical protein
SAG0268  [illegible]  glycyl-tRNA synthetase, alpha subunit
SAG0269  [illegible]  acyl carrier protein phosphodiesterase, putative
SAG0270  [illegible]  glycyl-tRNA synthetase, beta subunit
SAG0271  [illegible]  conserved hypothetical protein
SAG0272  87           membrane protein, putative
SAG0273  502          glycerol kinase
SAG0274  609          alpha-glycerophosphate oxidase
SAG0275  232          glycerol uptake facilitator protein
SAG0276  [illegible]  [illegible] oxidase, putative
SAG0277  476          conserved hypothetical protein
SAG0278  661          [illegible]
SAG0279  101          conserved hypothetical protein
SAG0280  244          ABC transporter, ATP-binding protein
SAG0281  534          membrane protein, putative
SAG0282  461          PTS system, IIBC components
SAG0283  267          glutamate 5-kinase
SAG0284  417          gamma-glutamyl phosphate reductase
SAG0285  298          conserved hypothetical protein TIGR00006
SAG0286  [illegible]  cell division protein FtsL, putative
SAG0287  [illegible]  penicillin-binding protein 2X
SAG0288  330          phospho-N-acetylmuramoyl-pentapeptide-transferase
SAG0289  [illegible]  ATP-dependent RNA helicase, DEAD/DEAH box family
SAG0290  [illegible]  ABC transporter, substrate-binding protein
SAG0291  [illegible]  amino acid ABC transporter, permease protein
SAG0292  247          amino acid ABC transporter, ATP-binding protein
SAG0293  [illegible]  conserved hypothetical protein
SAG0294  304          thioredoxin reductase
SAG0295  [illegible]  conserved hypothetical protein
SAG0296  273          NAD synthetase
SAG0297  444          aminopeptidase C
SAG0298  750          penicillin-binding protein 1A
SAG0299  199          recombination protein U
SAG0300  172          conserved hypothetical protein
SAG0301  40           hypothetical protein
SAG0302  110          conserved hypothetical protein
SAG0303  384          conserved hypothetical protein
SAG0304  487          conserved hypothetical protein
SAG0305  160          autoinducer-2 production protein LuxS
SAG0306  535          KH domain protein
SAG0307  33           hypothetical protein
SAG0308  298          ABC transporter, ATP-binding protein
SAG0309  246          ABC transporter, permease protein, putative
SAG0310  361          conserved hypothetical protein
SAG0311  NA           DNA-binding response regulator, authentic point mutation
SAG0312  234          conserved hypothetical protein
SAG0313  209          guanylate kinase
SAG0314  104          DNA-directed RNA polymerase, omega subunit, putative
SAG0315  [illegible]  primosomal protein N'
SAG0316  311          methionyl-tRNA formyltransferase
SAG0317  440          sun protein
SAG0318  245          serine/threonine phosphatase, putative
SAG0319  [illegible]  serine/threonine protein kinase
SAG0320  231          conserved hypothetical protein
SAG0321  339          sensor histidine kinase, putative
SAG0322  213          DNA-binding response regulator
SAG0323  [illegible]  hydrolase, haloacid dehalogenase family/peptidyl-prolyl cis-trans isomerase, cyclophilin type
SAG0324  [illegible]  general stress protein, putative
SAG0325  [illegible]  pyruvate formate-lyase-activating enzyme
SAG0326  251          transcriptional regulator, DeoR family
SAG0327  327          transcriptional regulator, putative
SAG0328  107          PTS system, cellobiose-specific IIA component
SAG0329  106          PTS system, cellobiose-specific IIB component
SAG0330  433          PTS system, cellobiose-specific IIC component
SAG0331  818          formate acetyltransferase
SAG0332  222          transaldolase family protein
SAG0333  [illegible]  glycerol dehydrogenase
SAG0334  [illegible]  cysteine synthase A
SAG0335  [illegible]  conserved hypothetical protein TIGR[illegible]
SAG0336  [illegible]  helicase, putative
SAG0337  [illegible]  competence protein F, putative
SAG0338  154          ribosomal subunit interface protein
SAG0339  [illegible]  aspartate kinase family protein
SAG0340  [illegible]  hydrolase, haloacid dehalogenase-like family
SAG0341  [illegible]  hypothetical protein
SAG0342  [illegible]  enoyl-CoA hydratase/isomerase family protein
SAG0343  144          transcriptional regulator, MarR family
SAG0344  [illegible]  3-oxoacyl-(acyl-carrier-protein) synthase III
SAG0345  [illegible]  acyl carrier protein
SAG0346  [illegible]  enoyl-(acyl-carrier-protein) reductase II
SAG0347  308          malonyl CoA-acyl carrier protein transacylase
SAG0348  244          3-oxoacyl-[acyl-carrier protein] reductase
SAG0349  410          3-oxoacyl-(acyl-carrier-protein) synthase II
SAG0350  166          acetyl-CoA carboxylase, biotin carboxyl carrier protein
SAG0351  140          (3R)-hydroxymyristoyl-(acyl-carrier-protein) dehydratase
SAG0352  456          acetyl-CoA carboxylase, biotin carboxylase
SAG0353  291          acetyl-CoA carboxylase, carboxyl transferase, beta subunit
SAG0354  257          acetyl-CoA carboxylase, carboxyl transferase, alpha subunit
SAG0355  210          conserved hypothetical protein
SAG0356  425          seryl-tRNA synthetase
SAG0357  330          membrane protein, putative
SAG0358  120          conserved hypothetical protein
SAG0359  [illegible]  PTS system, mannose-specific IID component
SAG0360  [illegible]  PTS system, mannose-specific IIC component
SAG0361  330          PTS system, mannose-specific IIAB components
SAG0362  [illegible]  hydrolase, haloacid dehalogenase-like family
SAG0363  [illegible]  hypothetical protein
SAG0364  [illegible]  membrane protein, putative
SAG0365  [illegible]  xanthine/uracil permease family protein
SAG0366  [illegible]  conserved hypothetical protein TIGR[illegible]
SAG0367  [illegible]  acetyltransferase, GNAT family
SAG0368  [illegible]  protein of unknown function
SAG0369  [illegible]  conserved hypothetical protein
SAG0370  [illegible]  [illegible] family protein
SAG0371  167          hypothetical protein
SAG0372  [illegible]  [illegible] protein
SAG0373  241          ABC transporter, ATP-binding protein
SAG0374  344          ABC transporter, permease protein
SAG0375  266          conserved hypothetical protein
SAG0376  211          conserved hypothetical protein TIGR00091
SAG0377  127          conserved hypothetical protein
SAG0378  379          N utilization substance protein A
SAG0379  98           conserved hypothetical protein
SAG0380  100          ribosomal protein L7A family
SAG0381  927          translation initiation factor IF-2
SAG0382  122          ribosome-binding factor A
SAG0383  334          protein of unknown function/lipoprotein, putative
SAG0384  138          transcriptional repressor CopY
SAG0385  744          copper-transporter ATPase CopA
SAG0386  68           copper-transporter protein CopZ
SAG0387  204          membrane protein, putative
SAG0388  270          hydrolase, haloacid dehalogenase-like family
SAG0389  880          DNA polymerase I
SAG0390  146          CoA-binding domain protein
SAG0391  159          transcriptional regulator, Fur family
SAG0392  521          cell wall surface anchor family protein
SAG0393  228          DNA-binding response regulator
SAG0394  345          sensor histidine kinase
SAG0395  246          membrane protein, putative
SAG0396  380          queuine tRNA-ribosyltransferase
SAG0397  102          conserved hypothetical protein
SAG0398  179          BioY family protein
SAG0399  258          AtsA/ElaC family protein
SAG0400  168          cytidine/deoxycytidylate deaminase family protein
SAG0401  44           hypothetical protein
SAG0402  449          glucose-6-phosphate isomerase
SAG0403  175          5-formyltetrahydrofolate cyclo-ligase family protein
SAG0404  225          rhomboid family protein
SAG0405  347          protein of unknown function/lipoprotein, putative
SAG0406  299          UTP-glucose-1-phosphate uridylyltransferase
SAG0407  338          glycerol-3-phosphate dehydrogenase (NAD(P)+)
SAG0408  109          ribonuclease P protein component
SAG0409  271          SpoIIIJ family protein
SAG0410  273          R3H domain protein
SAG0411  177          conserved hypothetical protein
SAG0412  258          recX protein
SAG0413  451          RNA methyltransferase, TrmA family
SAG0414  153          conserved hypothetical protein
SAG0415  142          acetyltransferase, GNAT family
SAG0416  1233         protease, putative
SAG0417  [illegible]  glycosyl transferase, group 2 family protein
SAG0418  [illegible]  ribonucleoside-diphosphate reductase 2, beta subunit
SAG0419  151          nrdI protein
SAG0420  [illegible]  ribonucleoside-diphosphate reductase 2, alpha subunit
SAG0421  [illegible]  cell wall surface anchor family protein
SAG0422  [illegible]  conserved hypothetical protein
SAG0423  132          conserved domain protein
SAG0424  94           hypothetical protein
SAG0425  105          carboxymuconolactone decarboxylase family protein
SAG0426  131          conserved hypothetical protein
SAG0427  129          transcriptional regulator, MerR family
SAG0428  345          alcohol dehydrogenase, zinc-containing
SAG0429 


Zo4 


oxidoreductase, aldo/keto reductase family 


o a /~* A/i a 

bAG0430 


2o7 


cation efflux system protein 


bAGU431 


174 


transcnptional regulator, TetR family 


SAG0432 


397 


transcnptional regulator, AraC family 


oAO0433 


1389 


surface protein Rib 


SAG0434


61 


transposase, IS256 family, truncation


SAG0435 


97 


DNA-damage-inducible protein J, putative


SAG0436


62 


hypothetical protein 


SAG0437


123 


lipoprotein, putative 


SAG0438


145 


bacteriophage L54a, integrase, truncation


SAG0439


NA


conserved hypothetical protein, degenerate


SAG0440 


84


conserved hypothetical protein


SAG0441 


103 


conserved domain protein 


SAG0442 


189 


acetyltransferase, GNAT family


SAG0443


194 


acetyltransferase, GNAT family 


SAG0444


188 


conserved hypothetical protein 


SAG0445


883 


valyl-tRNA synthetase 


SAG0446 


319 


oxidoreductase, Gfo/Idh/MocA family 


SAG0447 


287 


magnesium transporter, CorA family 


SAG0448


391 


transposase, IS256 family 


SAG0449 


354 


conserved hypothetical protein 


SAG0450 


330 


aspartate--ammonia ligase


SAG0451 


149 


bacteriocin transport accessory protein, putative 


SAG0452 


179 


type II DNA modification methyltransferase, putative 


SAG0453 


96 


hypothetical protein 


SAG0454 


161 


phosphopantetheine adenylyltransferase 


SAG0455


357 


conserved hypothetical protein 


SAG0456 


NA


conserved hypothetical protein, degenerate 


SAG0457


192 


conserved hypothetical protein 


SAG0458


368 


conserved hypothetical protein TIGR00048




1/1 


VanZP domain protein 


SAG0460


CO 1 

JOl 


ABC transporter, ATP-binding/permease protein


SAG0461


D/9 


ABC transporter, ATP-binding/permease protein




loo 


anthranilate synthase component II 


SAG0463


1 fy 


BioY family protein 


SAG0464




biotin synthetase 






hypothetical protein


SAG0466


J /I 


nuoiase 


SAG0467




AMP-binding enzyme domain protein


SAG0468




endonuclease III 






type IV prepilin peptidase-related protein


SAG0470 


69 


conserved hypothetical protein


SAG0471 


322 


glucokinase 


SAG0472 


126 


rhodanese-like family protein 


SAG0473 


613 


elongation factor Tu family protein 


SAG0474 


81 


conserved hypothetical protein 


SAG0475 


451 


UDP-N-acetylmuramoylalanine--D-glutamate ligase


SAG0476 


358 


UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase


SAG0477 


378 


cell division protein DivIB, putative 


SAG0478 


429 


cell division protein FtsA 


SAG0479 


426 


cell division protein FtsZ 


SAG0480 


224 


ylmE protein, putative 


SAG0481 


201 


ylmF protein 


SAG0482 


84 


YGGT family protein 


SAG0483 


262 


ylmH protein 


SAG0484 


256 


cell division protein DivIVA, putative 


SAG0485 


930 


isoleucyl-tRNA synthetase 


SAG0486 


100 


conserved hypothetical protein 


SAG0487 


151 


MutT/nudix family protein 


SAG0488 


753 


ATP-dependent Clp protease, ATP-binding subunit 


SAG0489 


34 


hypothetical protein 


SAG0490 


76 


conserved hypothetical protein 


SAG0491 


230 


amino acid ABC transporter, permease protein 


SAG0492 


244 


amino acid ABC transporter, ATP-binding protein 


SAG0493 


564 


phosphoglucomutase/phosphomannomutase family protein


SAG0494 


284 


methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase


SAG0495 


278 


protein of unknown function 


SAG0496 


446 


exodeoxyribonuclease VII, large subunit 


SAG0497 


71 


exodeoxyribonuclease VII, small subunit 


SAG0498 


290 


geranyltranstransferase, putative 


SAG0499 


275 


hemolysin A 


SAG0500 


157 


arginine repressor ArgR, putative 


SAG0501 


552 


DNA repair protein RecN


SAG0502 


278 


DegV family protein 


SAG0503 


279 


lipase/acylhydrolase 


SAG0504 


200 


conserved hypothetical protein 


SAG0505 


91 


DNA-binding protein HU 


SAG0506


65 


hypothetical protein 


SAG0507 


310 


dihydroorotate dehydrogenase A 


SAG0508 


411 


beta-lactam resistance factor 


SAG0509 


403 


beta-lactam resistance factor 


SAG0510 


406 


murM protein, putative 


SAG0511 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0512 


438 


HD domain protein 


SAG0513 


128 


conserved hypothetical protein 


SAG0514


894 


cation-transporting ATPase, E1-E2 family 


SAG0515 


286 


conserved hypothetical protein 






fructose-1,6-bisphosphatase, putative


SAG0517 


374 


iron-sulfur cluster-binding protein, putative 


SAG0518 


NA 


peptide chain release factor 2, programmed frameshift


SAG0519 


230 


cell division ABC transporter, ATP-binding protein FtsE 


SAG0520 


309 


cell division ABC transporter, permease protein FtsX 


SAG0521 


236 


carboxymethylenebutenolidase-related protein 


SAG0522 


232 


metallo-beta-lactamase superfamily protein


SAG0523 


Zj4 


oxidoreductase, short chain dehydrogenase/reductase family 


SAG0524


oJD 


DNA polymerase III, epsilon subunit/ATP-dependent helicase

UlUKJ 


SAG0525


y? i 


aspartate aminotransferase 


SAG0526


AAQ 


asparaginyl-tRNA synthetase 


SAG0527




conserved hypothetical protein 


SAG0528


327 


inosine-uridine preferring nucleoside hydrolase 


SAG0529


JO 


hypothetical protein


SAG0530


13/ 


OsmC/Ohr family protein


SAG0531


29o 


conserved hypothetical protein


SAG0532


324 


conserved hypothetical protein 


SAG0533


303 


conserved hypothetical protein 


SAG0534


465 


dipeptidase 


SAG0535


506 


zinc ABC transporter, zinc-binding adhesion lipoprotein


SAG0536 


86 


ribosomal protein L31


SAG0537 


311 


DHH family protein 


SAG0538


340 


adenosine deaminase, putative 


SAG0539 


147 


flavodoxin 


SAG0540 


91 


chorismate mutase, putative 


SAG0541 


398 


voltage-gated chloride channel family protein 


SAG0542 


127 


IS1381, transposase OrfA 


SAG0543 


129 


IS1381, transposase OrfB 


SAG0544


115 


ribosomal protein L19


SAG0545 


359 


prophage LambdaSa1, site-specific recombinase, phage integrase family


SAG0546


67 


conserved domain protein 


SAG0547


1 oc 

lo5 


hypothetical protein 


SAG0548


205 


prophage LambdaSa1, repressor protein, putative


SAG0549


47 


hypothetical protein 




/4 


conserved hypothetical protein 


SAG0551


<o 
DZ 


conserved hypothetical protein 


SAG0552




hypothetical protein 


SAG0553


OAS! 

zoo 


hypothetical protein 






prophage LambdaSa1, transcriptional regulator, Cro/CI family




z*ty 


prophage LambdaSa1, antirepressor, putative






hypothetical protein




7A 
/O 


hypothetical protein






hypothetical protein


SAG0559


zoo 


conserved hypothetical protein




77 


conserved hypothetical protein


SAG0561 


46 


hypothetical protein 


SAG0562 


84 


hypothetical protein


SAG0563 


53 


hypothetical protein 


SAG0564 


160 


conserved hypothetical protein


SAG0565 


224 


conserved domain protein 


SAG0566 


138 


prophage LambdaSa1, single-strand binding protein


SAG0567 


439 


prophage LambdaSa1, reverse transcriptase/maturase family protein


SAG0568


67 


conserved hypothetical protein




158 


conserved hypothetical protein


SAG0570 


115 


hypothetical protein


SAG0571


43 


hypothetical protein


SAG0572 


138


conserved hypothetical protein


SAG0573 


54 


hypothetical protein


SAG0574


89 


conserved hypothetical protein


SAG0575


110 


hypothetical protein


SAG0576


43 


hypothetical protein


SAG0577


177 


conserved hypothetical protein


SAG0578


RR 
00 


conserved hypothetical protein


SAG0579


149 


conserved hypothetical protein


SAG0580


111


conserved hypothetical protein, truncation


SAG0581


lift 


conserved hypothetical protein




490 
*tzz 


conserved hypothetical protein




40£ 


conserved hypothetical protein




OZ 


conserved hypothetical protein, truncation 




Alt 


conserved hypothetical protein 


SAG0586


1 ^4 


conserved hypothetical protein 


SAG0587


D\JU 


prophage LambdaSa1, structural protein, putative


SAG0588


/ 1 


conserved hypothetical protein






conserved hypothetical protein




1 1Z 


conserved hypothetical protein 




/O 


conserved hypothetical protein


SAG0592 


ill 


conserved hypothetical protein 


SAG0593


152^ 

lOJ 


prophage LambdaSa1, structural protein


SAG0594


ol 


conserved hypothetical protein




1ZJ 


conserved hypothetical protein




£70 


prophage LambdaSa1, pblA protein, internal deletion


SAG0597 




prophage LambdaSa1, minor structural protein, putative


SAG0598 


1374 


prophage LambdaSa1, N-acetylmuramoyl-L-alanine amidase, family 4


SAG0599 


668 


prophage LambdaSa1, minor structural protein, putative


SAG0600 


109 


hypothetical protein


SAG0601 


70 


hypothetical protein


SAG0602 


100 


conserved hypothetical protein


SAG0603 


111 


conserved hypothetical protein


SAG0604 


239 


prophage LambdaSa1, lysin, putative


SAG0605


323 


conserved hypothetical protein


SAG0606 


66 


conserved hypothetical protein


SAG0607 


56 


conserved hypothetical protein


SAG0608 


59 


hypothetical protein 


SAG0609 


NA 


prophage LambdaSa1, integrase, degenerate


SAG0610 


134 


conserved hypothetical protein 


SAG0611 


NA 


transposase, degenerate 


SAG0612 


53 


conserved hypothetical protein 


SAG0613 


425 


transmembrane protein Vexp1


SAG0614 


218 


ABC transporter, ATP-binding protein Vexp2


SAG0615 


458 


transmembrane protein Vexp3


SAG0616 


217 


DNA-binding response regulator VncR


SAG0617 


439 


sensor histidine kinase VncS


SAG0618


195 


transposase OrfB, IS3 family, truncation


SAG0619 


66 


conserved hypothetical protein


SAG0620


62 


hypothetical protein


SAG0621 


401 


rod shape-determining protein RodA, putative


SAG0622 


1 R6 


hydrolase, haloacid dehalogenase-like family


SAG0623 


650 


DNA gyrase, B subunit


SAG0624




septation ring formation regulator EzrA, putative


SAG0625


913 


phosphoserine phosphatase SerB


SAG0626


161 


MutT/nudix family protein


SAG0627


151


conserved hypothetical protein


SAG0628


HO J 


enolase


SAG0629




conserved domain protein




HZ/ 


3-phosphoshikimate 1-carboxyvinyltransferase


SAG0631 


170 


shikimate kinase 


SAG0632


4D/ 


psr protein 


SAG0633




RNA methyltransferase, TrmA family




/U 


hypothetical protein 


SAG0635




acid phosphatase, class B 




1*79 


conserved hypothetical protein 




KF A 


transcriptional regulator, TetR family, putative, authentic frameshift




1 HQ 
ivy 


cell wall surface anchor family protein, truncation


SAG0639 


971 


transposase OrfA, IS3 family


SAG0640 


91 


transposase OrfB, IS3 family


SAG0641 


NA 


i yjiL iu pruicin, ucgcneiaic 


SAG0642 


59 


hypothetical protein


SAG0643 


NA 




SAG0644 


402 


transcriptional regulator, AraC family


SAG0645 


554 


cell wall surface anchor family protein


SAG0646 


307 


cell wall surface anchor family protein


SAG0647 


305 


sortase family protein


SAG0648 


260 


sortase family protein


SAG0649 


890 


cell wall surface anchor family protein, putative


SAG0650 


189 


sortase family protein


SAG0651 


201


protein of unknown function


SAG0652 


NA 


Tn5252, Orf 28 protein, degenerate


SAG0653 


NA 


conserved hypothetical protein, degenerate


SAG0654 


34 


hypothetical protein


SAG0655 


57 


conserved hypothetical protein 


SAG0656 


36 


hypothetical protein 


SAG0657 


89 


hypothetical protein 


SAG0658 


383 


lipoprotein, putative 


SAG0659 


330 


ABC transporter, ATP-binding protein 


SAG0660 


272 


membrane protein 


SAG0661 


261 


conserved hypothetical protein


SAG0662 


101 


cylX protein 


SAG0663 


282 


cylD protein 


SAG0664 


240 


cylG protein 


SAG0665 


101 


acyl carrier protein AcpC


SAG0666 


158 


cylZ protein


SAG0667 


309 


cylA protein


SAG0668 


292 


cylB protein 


SAG0669 


667 


cylE protein 


SAG0670 


317 


cylF protein 


SAG0671 


731 


cylI protein


SAG0672 


403 


cylJ protein


SAG0673 


191 


cylK protein 


SAG0674 


113 


hypothetical protein 


SAG0675 


171 


putative secreted protein 


SAG0676 


885 


proteinase, putative 


SAG0677 


1062 


hypothetical protein 


SAG0678 


NA 


endopeptidase O, degenerate 


SAG0679 


343 


protein of unknown function 


SAG0680 


339 


protein of unknown function 


SAG0681 


353 


conserved domain protein 


SAG0682 


409 


permease, putative 


SAG0683 


NA 


transmembrane protein Vexp3, putative, degenerate 


SAG0684 


223 


ABC transporter, ATP-binding protein 


SAG0685 


472 


conserved hypothetical protein 


SAG0686 


261 


DNA-entry nuclease, putative 


SAG0687 


212 


DedA family protein, putative 


SAG0688 


218 


ABC transporter, ATP-binding protein 


SAG0689 


257 


membrane protein, putative 


SAG0690 


272 


conserved hypothetical protein 


SAG0691 


294 


transcriptional regulator, LysR family


SAG0692 


193 


regulatory protein, putative 


SAG0693 


377 


IS1548, transposase


SAG0694 


173 


regulatory protein, putative, truncation 


SAG0695 


330 


D-lactate dehydrogenase 


SAG0696 


516 


sodium:galactoside symporter family protein, putative


SAG0697 


341


2-keto-3-deoxygluconate kinase 


SAG0698 


599


beta-glucuronidase 


SAG0699


223


transcriptional regulator, GntR family


SAG0700 


205


2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase


SAG0701


4oo 


glucuronate isomerase 


SAG0702




mannonate dehydratase


SAG0703


279 


D-mannonate oxidoreductase 


SAG0704 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0705 


596 


glycosyl hydrolase, family 3


SAG0706 


361 


proline dipeptidase 


SAG0707 


334 


transcriptional regulator, RegM family


SAG0708 


488 


alpha amylase family protein




JJZ 


glycosyl transferase, group 1 family protein 


SAG0710




glycosyl transferase, group 1 family protein


SAG0711


A4*7 


threonyl-tRNA synthetase


SAG0712


71 A 
Z3** 


DNA-binding response regulator 


SAG0713




conserved hypothetical protein 


SAG0714


1 QQ 
loo 


conserved hypothetical protein 


SAG0715


ZIO 


amino acid ABC transporter, permease protein 


SAG0716


Z31 


amino acid ABC transporter, permease protein 


SAG0717


7/S< 
ZOO 


amino acid ABC transporter, amino acid-binding protein


SAG0718


ZDl 


amino acid ABC transporter, ATP-binding protein




Z3o 


DNA-binding response regulator 


SAG0720


44y 


sensory box histidine kinase 


SAG0721


2oy 


metallo-beta-lactamase superfamily protein 


SAG0722


122


conserved hypothetical protein 


SAG0723


236 


ribonuclease III 


SAG0724


1179 


chromosome segregation SMC protein 




265 


hydrolase, haloacid dehalogenase-like family 


SAG0726


274 


hydrolase, haloacid dehalogenase-like family 


SAG0727


536 


signal recognition particle-docking protein FtsY 


SAG0728


270 


ABC transporter, substrate-binding protein 


SAG0729


300 


ABC transporter, permease protein, putative 


SAG0730


42 


ABC transporter, ATP-binding protein 


SAG0731


347 


bacterial luciferase family protein 


SAG0732


720 


transcriptional accessory protein Tex, putative


SAG0733


142


conserved hypothetical protein


SAG0734


57 


phage shock protein C, putative 






hypothetical protein 


SAG0736


311 


HPr(Ser) kinase/phosphatase


SAG0737




prolipoprotein diacylglyceryl transferase 


SAG0738


1 37 

13Z 


conserved hypothetical protein




l*f3 


conserved hypothetical protein 


SAG0740


01 

2rl 


conserved hypothetical protein 


SAG0741 


^oi 


peptidase, U32 family, putative


SAG0742


*t\ZO 


peptidase, U32 family


SAG0743 


70 


conserved hypothetical protein


SAG0744 




membrane protein, putative


SAG0745


AA£L 
HHO 


Mn2+/Fe2+ transporter, NRAMP family


SAG0746 




riboflavin biosynthesis protein RibD


SAG0747 


20ft 


riboflavin synthase, alpha subunit


SAG0748 


1Q7 


riboflavin biosynthesis protein RibA


SAG0749 




riboflavin synthase, beta subunit


SAG0750 


496 


lysyl-tRNA synthetase 


SAG0751 


300 


hydrolase, haloacid dehalogenase-like family 


SAG0752 


213 


phosphoglycerate mutase family protein 


SAG0753 


157 


ebsC family protein, putative 


SAG0754 


205 


conserved domain protein 


SAG0755 


282 


peptidase, U32 family 


SAG0756 


174 


conserved hypothetical protein 





SAG0757


129 


protein of unknown function/lipoprotein, putative






oligoendopeptidase F, putative


SAG0759


931 


phosphoenolpyruvate carboxylase 


SAG0760


377 


IS1548, transposase




422 


cell division protein, FtsW/RodA/SpoVE family 


SAG0762 


398 


translation elongation factor Tu 


SAG0763


252 


triosephosphate isomerase


SAG0764 


230 


phosphoglycerate mutase family protein 


SAG0765


681 


penicillin-binding protein 2b


SAG0766 


198 


recombination protein RecR 


SAG0767 


348 


D-alanine--D-alanine ligase


SAG0768 


455


UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate--D-alanyl-D-alanyl ligase


SAG0769 


406 


oxalate:formate antiporter


SAG0770 


228 


membrane protein, putative 


SAG0771 


512 


cell wall surface anchor family protein 


SAG0772 


514 


peptide chain release factor 3 


SAG0773 


126 


conserved hypothetical protein 


SAG0774


244 


ABC transporter, ATP-binding protein


SAG0775 


220 


ABC transporter, permease protein 


SAG0776 


276 


YaeC family protein, putative 


SAG0777 


528 


ATP-dependent RNA helicase, DEAD/DEAH box family 


SAG0778 


88 


conserved hypothetical protein 


SAG0779 


254 


conserved hypothetical protein 


SAG0780


246 


acyltransferase family protein


SAG0781 


217 


competence protein CelA 




745 


DNA internalization-related competence protein ComEC/Rec2


SAG0783


269 


hydrolase, haloacid dehalogenase-like family


SAG0784


314 


sugar-binding transcriptional regulator, LacI family 


SAG0785


330 


conserved hypothetical protein 


SAG0786


Z4Z 


conserved domain protein 


SAG0787


34D 


DNA polymerase III, delta subunit, putative


SAG0788


ZUZ 


superoxide dismutase, Fe-Mn 


SAG0789


Zo3 


transcriptional antiterminator LicT 




ozz 


PTS system, beta-glucosides-specific IIABC components 


SAG0791


4/3 


6-phospho-beta-glucosidase 


SAG0792


304 


conserved hypothetical protein 






glycerate kinase 2




*rlO 


permease, GntP family




JJH 


conserved hypothetical protein


SAG0796




transcriptional regulator, MarR family


SAG0797 


342 


S-adenosylmethionine:tRNA ribosyltransferase-isomerase


SAG0798 


226 


membrane protein, putative 


SAG0799 


233 


glucosamine-6-phosphate isomerase


SAG0800 


318 


glutathione S-transferase family protein 


SAG0801 


239 


ribosomal small subunit pseudouridine synthase A


SAG0802 


38 


hypothetical protein


SAG0803 


383 


major facilitator family protein


SAG0804


315 


X^xJULx yj C Id LKj C pxl/LCXll V^VH/a. 


SAG0805


601 


oligoendopeptidase F


SAG0806


208 


hydrolase, haloacid dehalogenase-like family






O-methyltransferase family protein


SAG0808




protease maturation protein, putative




161


conserved hypothetical protein


SAG0810


R79 




SAG0811


91R 


membrane protein, putative


SAG0812


779 


glycosyl transferase, family 8


SAG0813


R1 
0 1 


hypothetical protein


SAG0814




conserved hypothetical protein


SAG0815


71 


transcriptional regulator, Cro/CI family


SAG0816


9^ 


membrane protein, putative


SAG0817


1 Q7 
15/ 


membrane protein, putative






ribonucleoside-diphosphate reductase 2, beta subunit


SAG0819


71 O 


ribonucleoside-diphosphate reductase 2, alpha subunit 


SAG0820


/4 


ribonucleoside-diphosphate reductase 2, NrdH-redoxin 


SAG0821


07 


phosphocarrier protein HPr 


SAG0822


D / / 


phosphoenolpyruvate-protein phosphotransferase 


SAG0823


4/0 


glyceraldehyde-3-phosphate dehydrogenase, NADP-dependent 


SAG0824


ill *7 

41 / 


polysaccharide deacetylase family protein 






ATP-dependent RNA helicase, DEAD/DEAH box family


SAG0826




uridine kinase 


SAG0827


! lOD 


conserved hypothetical protein




DD4 


DNA polymerase III, gamma and tau subunits


SAG0829


1 A/1 


conserved hypothetical protein




•31 1 

Oil 


liiAfin -wi^Q^*it 1 _ A «-« *3 ■»*T^/*k"v\ tIica Itrroco 

oioiiii--aceLyi-v-'0/\-carDoxyiase ngase 


<nAGOR11 


1QR 


S-adenosylmethionine synthetase


SAG0832


7^ 


protein of unknown function


SAG0833


1R1 
101 


hypothetical protein


SAG0834


49 


hypothetical protein


SAG0835


1RR 
1 00 


conserved hypothetical protein


SAG0836


184


conserved hypothetical protein


SAG0837 


428 


ABC transporter, ATP-binding protein


SAG0838 


233 


hypothetical protein


SAG0839 


226 


transcriptional regulator, TenA family


SAG0840 


265 


phosphomethylpyrimidine kinase


SAG0841 


256 


hydroxyethylthiazole kinase


SAG0842 


223 


thiamine-phosphate pyrophosphorylase


SAG0843 


419 

UDP-N-acetylglucosamine 1-carboxyvinyltransferase


SAG0844 


184 


acetyltransferase, GNAT family


SAG0845 


427 


CBS domain protein 


SAG0846 


286 


methionine aminopeptidase, type I 


SAG0847 


306 


ribonuclease BN, putative 


SAG0848 


151 


GtrA family protein 


SAG0849 


169 


conserved hypothetical protein 


SAG0850 


652 


DNA ligase, NAD-dependent 


SAG0851 


339 


bmrU protein, putative


SAG0852 


766 


pullulanase, putative


SAG0853 


622 


1,4-alpha-glucan branching enzyme


SAG0854 


379 


glucose-1-phosphate adenylyltransferase


SAG0855 


NA 


glycogen biosynthesis protein GlgD, authentic frameshift


SAG0856 


476 


glycogen synthase


SAG0857 


66 


ATP synthase F0 C subunit 


SAG0858 


238 


ATP synthase F0 A subunit


SAG0859 


165 


ATP synthase F0 B subunit


SAG0860 


178 


ATP synthase F1 delta subunit


SAG0861 


501 


ATP synthase F1 alpha subunit


SAG0862 


293 


ATP synthase F1 gamma subunit


SAG0863 


468 


ATP synthase F1 beta subunit


SAG0864 


137 


ATP synthase F1 epsilon subunit


SAG0865 


76 


conserved hypothetical protein


SAG0866 


423 


UDP-N-acetylglucosamine 1-carboxyvinyltransferase


SAG0867 


63 


conserved hypothetical protein


SAG0868 


285 


x^lN^V^CUll jr JLLUvrlwaaC 


SAG0869 


346 


phenylalanyl-tRNA synthetase, alpha subunit


SAG0870 


17^ 


acetyltransferase, GNAT family


SAG0871 


801 


phenylalanyl-tRNA synthetase, beta subunit


SAG0872 


300 


conserved hypothetical protein 


SAG0873 


1077 


exonuclease RexB


SAG0874 


1207 


exonuclease RexA 


SAG0875 




magnesium transporter, CorA family, putative


SAG0876 


458 


tRNA modification GTPase TrmE


SAG0877 




ABC transporter, ATP-binding protein


SAG0878 


322 


acetoin dehydrogenase, thymine PPi dependent, E1 component, alpha subunit


SAG0879 




acetoin dehydrogenase, thymine PPi dependent, E1 component, beta subunit


SAG0880 


462 


acetoin dehydrogenase, thymine PPi dependent, E2 component, dihydrolipoamide acetyltransferase


SAG0881 


585 


acetoin dehydrogenase, thymine PPi dependent, E3 component, dihydrolipoamide dehydrogenase


SAG0882 


329 


lipoate-protein ligase A 


SAG0883 


261 


cobyric acid synthase, putative 


SAG0884 


447 


mur ligase family protein 


SAG0885 


283 


conserved hypothetical protein TIGR00159 


SAG0886 


319 


protein of unknown function 


SAG0887 


450


phosphoglucomutase/phosphomannomutase family protein


SAG0888 


123 


conserved hypothetical protein 


SAG0889 


126 


conserved hypothetical protein 


SAG0890 


376 


oxygen-independent coproporphyrinogen III oxidase, putative


SAG0891 


245 


conserved hypothetical protein 


SAG0892 


256 


hydrolase, haloacid dehalogenase-like family 


SAG0893 


218 


conserved hypothetical protein 


SAG0894 


1370 


protein of unknown function 


SAG0895 


289 


lipoyl-binding domain protein


SAG0896 


1 AO 


oxidoreductase, putative 


SAG0897 


221


conserved hypothetical protein 


SAG0898 


o3 


hypothetical protein


SAG0899


57 


hypothetical protein


SAG0900 


56 


hypothetical protein


SAG0901


127 


hypothetical protein


SAG0902


45


hypothetical protein


SAG0903 


44


hypothetical protein 


SAG0904


56 


hypothetical protein 


SAG0905 


138


nucleoside diphosphate kinase 


SAG0906 


610 


GTP-binding protein LepA 


SAG0907 


877 


protein of unknown function/lipoprotein, putative


SAG0908 


203 


HD domain protein 


SAG0909 


154 


acetyltransferase, GNAT family 


SAG0910 


144 


PilB-related protein 


SAG0911 


930 


cation-transporting ATPase, E1-E2 family


SAG0912 


367 


nucleoside diphosphate kinase domain protein 


SAG0913 


212 


chloramphenicol acetyltransferase 


SAG0914 


203 


conserved hypothetical protein 


SAG0915 


405 


Tn916, transposase 


SAG0916 


67 


Tn916, excisionase 


SAG0917 


83 


Tn916, hypothetical protein 


SAG0918 


76 


Tn916, hypothetical protein


SAG0919 


157 


Tn916, hypothetical protein


SAG0920 


23 


Tn916, hypothetical protein


SAG0921 


117 


Tn916, transcriptional regulator, putative 


SAG0922 


61 


Tn916, hypothetical protein


SAG0923 


639 


Tn916, tetracycline resistance protein 


SAG0924 


28 


Tn916, tetM leader peptide 


SAG0925 


310 


Tn916, hypothetical protein


SAG0926


333


Tn916, NLP/P60 family protein


SAG0927


725


membrane protein, putative 


SAG0928


NA


Tn916, hypothetical protein, authentic frameshift


SAG0929


loo 


Tn916, hypothetical protein




100 


Tn916, hypothetical protein


SAG0931


Id 


Tn916, hypothetical protein


SAG0932




Tn916, transcriptional regulator, putative






Tn916, FtsK/SpoIIIE family protein




1 Oft 


Tn916, hypothetical protein




1UH- 


Tn916, hypothetical protein




30 


Tn916, hypothetical protein


SAG0937 


NA 


ABC transporter, ATP-binding protein, authentic frameshift


SAG0938 


122 


transcriptional regulator, GntR family 


SAG0939 


1034 


DNA polymerase III, alpha subunit


SAG0940 


340 


6-phosphofructokinase 


SAG0941 


500 


pyruvate kinase 


SAG0942 


185 


signal peptidase I, putative 


SAG0943 


47 


hypothetical protein


SAG0944




glucosamine--fructose-6-phosphate aminotransferase, isomerizing


SAG0945


17*7 


IS1548, transposase


SAG0946


1 AO 

iuy 


phnA protein 


SAG0947


21 J 


amino acid ABC transporter, permease protein 


SAG0948


OAA 


amino acid ABC transporter, ATP-binding protein


SAG0949


2/6 


amino acid ABC transporter, amino acid-binding protein


SAG0950


o2 


ribosomal protein S20


SAG0951




pantothenate kinase 


SAG0952


19o 


conserved hypothetical protein


SAG0953


129


cytidine deaminase 


SAG0954


349 


protein of unknown function/lipoprotein, putative


SAG0955 


511 


sugar ABC transporter, ATP-binding protein 


SAG0956 


353 


sugar ABC transporter, permease protein, putative 


SAG0957


318 


sugar ABC transporter, permease protein, putative


SAG0958 


456 


NADH oxidase 


SAG0959 


329 


L-lactate dehydrogenase 


SAG0960 


819 


DNA gyrase, A subunit 


SAG0961 


247 


sortase SrtA


SAG0962 


137 


glyoxalase family protein


SAG0963 


320 


conserved hypothetical protein 


SAG0964 


375 


Na+/H+ exchanger family protein 


SAG0965 


127 


IS1381, transposase OrfA 


SAG0966 


129 


IS1381, transposase OrfB 


SAG0967


520 


GMP synthase 


SAG0968 


232 


transcriptional regulator, GntR family 


SAG0969


444


gid protein


SAG0970


247 


acetyltransferase, GNAT family 




282 


protein of unknown function/lipoprotein, putative


SAG0972


NA


conserved hypothetical protein, authentic frameshift


SAG0973


ha 
32U 


nisin-resistance protein, putative




O^A 


ABC transporter, ATP-binding protein


SAG0975


o^l 


ABC transporter, permease protein, putative 


SAG0976


222 


DNA-binding response regulator 




J12 


sensor histidine kinase 






site-specific recombinase, phage integrase family






ABC transporter, substrate-binding protein


SAG0980


0^7 


conserved hypothetical protein


SAG0981


zzo 


saiLi protein 






signal recognition particle protein Ffh




1 1 A 


conserved hypothetical protein


SAG0984


4^7 


sensor histidine kinase CiaH


SAG0985 


226 


DNA-binding response regulator CiaR


SAG0986 


849 


aminopeptidase N 


SAG0987 


217 


phosphate transport system regulatory protein PhoU 


SAG0988 


252 


phosphate ABC transporter, ATP-binding protein PstB, putative


SAG0989 


267 


phosphate ABC transporter, ATP-binding protein PstB, putative 


SAG0990 


295 


phosphate ABC transporter, permease protein PstA, putative 


SAG0991 


305 


phosphate ABC transporter, permease protein


SAG0992 


286 


phosphate ABC transporter, phosphate-binding protein 


SAG0993 


436 


NOL1/NOP2/sun family protein


SAG0994 


254 


inositol monophosphatase family protein


SAG0995 


93 


conserved hypothetical protein 


SAG0996 


137 


conserved hypothetical protein 


SAG0997 


310 


macrolide-efflux protein mreA/riboflavin biosynthesis protein
RibF


SAG0998 


294 


tRNA pseudouridine synthase B 


SAG0999 


143 


acetyltransferase, GNAT family 


SAG1000 


423 


conserved hypothetical protein 


SAG1001 


196 


conserved hypothetical protein 


SAG1002 


292 


protease, putative


SAG1003 


876 


permease, putative 


SAG1004 


233 


ABC transporter, ATP-binding protein 


SAG1005 


706 


DNA topoisomerase I 


SAG1006 


280 


DprA/SMF protein, putative DNA processing factor 


SAG1007 


342 


iron-compound ABC transporter, iron-compound-binding protein 


SAG1008 


253 


iron compound ABC transporter, ATP-binding protein


SAG1009 


324 


iron compound ABC transporter, permease protein 


SAG1010 


320 


iron compound ABC transporter, permease protein 


SAG1011 


182 


acetyltransferase, CysE/LacA/LpxA/NodL family


SAG1012 


253 


ribonuclease HII 


SAG1013 


283 


GTP-binding protein


SAG1014 


190 


conserved hypothetical protein 


SAG1015 


494 


carbon starvation protein CstA, putative


SAG1016 


244 


response regulator 


SAG1017 


579 


sensor histidine kinase, putative 


SAG1018 


40 


lipoprotein, putative 


SAG1019 


39 


hypothetical protein 


SAG1020 


227 


lipoprotein, putative 


SAG1021 


107 


hypothetical protein


SAG1022 


177 


hypothetical protein 


SAG1023 


48 


hypothetical protein 


SAG1024 


183 


lipoprotein, putative 


SAG1025 


149 


hypothetical protein 


SAG1026


NA


immunogenic secreted protein, degenerate 


SAG1027 


84 


conserved hypothetical protein


SAG1028 


196 


hypothetical protein


SAG1029 


101 


hypothetical protein 


SAG1030 


304 


protein of unknown function 


SAG1031 


120 


conserved domain protein 




oD 


conserved hypothetical protein


SAG1033 


1309 


FtsK/SpoIIIE family protein 


SAG1034 


55


hypothetical protein 


SAG1035 


424 


conserved hypothetical protein 


SAG1036 


80 


conserved hypothetical protein 


SAG1037 


157 


hypothetical protein 


SAG1038 


1003 


phage infection protein, putative


SAG1039


96 


conserved hypothetical protein 


SAG1040 


260 


conserved domain protein 


SAG1041 


107 


hypothetical protein 


SAG1042 


1060 


carbamoyl-phosphate synthase, large subunit 


SAG1043 


358 


carbamoyl-phosphate synthase, small subunit 


SAG1044 


307 


aspartate carbamoyltransferase 


SAG1045 


430 


dihydroorotase, multifunctional complex type 


SAG1046 


209 


orotate phosphoribosyltransferase 


SAG1047 


233 


orotidine 5-phosphate decarboxylase


SAG1048 


410 


membrane protein, putative 


SAG1049 


513 


ABC transporter, ATP-binding protein 


SAG1050 


112 


ribonucleotide reductase, truncation 


SAG1051


358 


aspartate-semialdehyde dehydrogenase 


SAG1052 


47 


cell wall surface anchor family protein, putative 


SAG1053 


30 


hypothetical protein 


SAG1054 


531


cardiolipin synthetase 


SAG1055 


556 


formate--tetrahydrofolate ligase


SAG1056 


339 


lipoate-protein ligase A 


SAG1057 


292 


conserved hypothetical protein 


SAG1058 


272 


conserved hypothetical protein 


SAG1059 


110


glycine cleavage system H protein, putative 


SAG1060 


328 


bacterial luciferase family protein 


SAG1061 


399 


oxidoreductase, FMN-binding 


SAG1062 


282 


lipoate-protein ligase A family protein 


SAG1063 


228 


flavoprotein-related protein 


SAG1064 


180 


flavoprotein family protein 


SAG1065 


190 


membrane protein, putative 


SAG1066 


572 


phosphoglucomutase 


SAG1067 


178 


IS861, transposase OrfA 


SAG1068 


277 


IS861, transposase OrfB 


SAG1069 


65 


hypothetical protein 


SAG1070 


577 


ABC transporter, ATP-binding/permease protein 


SAG1071 


573 


ABC transporter, ATP-binding/permease protein 


SAG1072 


200 


conserved hypothetical protein 


SAG1073 


325 


conserved hypothetical protein 


SAG1074 


418 


serine hydroxymethyltransferase 


SAG1075 


183 


Sua5/YciO/YrdC/YwlC family protein


SAG1076


276 


modification methylase, HemK family 


SAG1077


359 


peptide chain release factor 1 


SAG1078 


189


thymidine kinase


SAG1079 


60 


4-oxalocrotonate tautomerase 






hypothetical protein


SAG1081 


312 


ApbE family protein 


SAG1082 


200 


conserved hypothetical protein 


SAG1083 


411 


conserved hypothetical protein 


SAG1084 


262 


formate/nitrite transporter family protein


SAG1085 


424 


xanthine permease 


SAG1086 


193 


xanthine phosphoribosyltransferase


SAG1087


327 


guanosine monophosphate reductase 


SAG1088 


446 


drug resistance transporter, EmrB/QacA family, putative 


SAG1089 


230 


conserved hypothetical protein 


SAG1090 


666 


potassium uptake protein, putative 


SAG1091 


216 


oxidoreductase, short chain dehydrogenase/reductase family 


SAG1092 


330 


phosphate acetyltransferase 


SAG1093 


294 


ribosomal large subunit pseudouridine synthase, RluD subfamily


SAG1094 


278 


conserved hypothetical protein 


SAG1095 


223 


GTP pyrophosphokinase family protein


SAG1096 


190 


conserved hypothetical protein 


SAG1097 


324 


ribose-phosphate pyrophosphokinase 


SAG1098 


371 


cysteine desulphurase 


SAG1099 


115 


conserved hypothetical protein 


SAG1100 


210 


conserved hypothetical protein


SAG1101 


226 


DNA repair protein RadC


SAG1102 


377 


membrane protein, putative 


SAG1103 


478 


6-phospho-beta-glucosidase


SAG1104 


204 


platelet activating factor, putative 


SAG1105 


273 


hydrolase, haloacid dehalogenase-like family 


SAG1106 


309 


transcriptional regulator, AraC family, putative 


SAG1107 


510 


voltage-gated chloride channel family protein 


SAG1108 


357 


spermidine/putrescine ABC transporter, spermidine/putrescine- 
binding protein 


SAG1109 


258 


spermidine/putrescine ABC transporter, permease protein 


SAG1110 


264 


spermidine/putrescine ABC transporter, permease protein 


SAG1111


384 


spermidine/putrescine ABC transporter, ATP-binding protein


SAG1112 


300 


UDP-N-acetylenolpyruvoylglucosamine reductase


SAG1113 


162 


2-amino-4-hydroxy-6-hydroxymethyldihydropteridine
pyrophosphokinase


SAG1114 


120 


dihydroneopterin aldolase


SAG1115 


267 


dihydropteroate synthase 


SAG1116 


187 


GTP cyclohydrolase I 


SAG1117 


420 


folylpolyglutamate synthase


SAG1118 


295


rarD protein 


SAG1119 




homoserine kinase


SAG1120 


427 


homoserine dehydrogenase 


SAG1121 


295 


polysaccharide deacetylase family protein 


SAG1122


515 


transporter, BCCT family protein 


SAG1123


34 


hypothetical protein 


SAG1124


/ICO 

45t» 


aldehyde dehydrogenase family protein 


SAG1125 


335 


membrane protein, putative 


SAG1126


JO ft 


protein of unknown function


SAG1127 


446 


conserved domain protein 


SAG1128 


65 


transcriptional regulator, Cro/CI family 


SAG1129 


36 


hypothetical protein 


SAG1130 


49 


hypothetical protein 


SAG1131 


164 


thiol peroxidase 


SAG1132 


219 


conserved hypothetical protein


SAG1133


254 


conserved hypothetical protein 


SAG1134 


213 


transcriptional regulator, GntR family/potassium uptake protein,
TrkA family


SAG1135 


183 


gls24 protein, putative 


SAG1136 


65 


conserved hypothetical protein 


SAG1137 


180 


gls24 protein, putative 


SAG1138 


64 


conserved hypothetical protein 


SAG1139 


193 


conserved hypothetical protein 


SAG1140 


82 


conserved hypothetical protein 


SAG1141 


112 


conserved hypothetical protein 


SAG1142 


759 


ATP-dependent DNA helicase PcrA 


SAG1143 


128 


conserved hypothetical protein 


SAG1144 


441 


uracil permease 


SAG1145 


448 


sodium:alanine symporter family protein


SAG1146 


411 


cation efflux family protein 


SAG1147 


130 


conserved hypothetical protein 


SAG1148 


231 


membrane protein, putative 


SAG1149 


207 


lipoprotein, putative 


SAG1150 


400 


ribosomal protein S1


SAG1151 


76 


conserved hypothetical protein 


SAG1152 


340 


branched-chain amino acid aminotransferase 


SAG1153 


819 


DNA topoisomerase IV, A subunit 


SAG1154 


653 


DNA topoisomerase IV, B subunit 


SAG1155 


212 


membrane protein, putative 


SAG1156 


217 


uracil-DNA glycosylase 


SAG1157 


161 


conserved hypothetical protein 


SAG1158 


413 


CMP-N-acetylneuraminic acid synthetase NeuA 


SAG1159 


209 


neuD protein 


SAG1160 


384 


UDP-N-acetylglucosamine-2-epimerase NeuC 


SAG1161 


341 


N-acetyl neuraminic acid synthetase NeuB


SAG1162 


466 


polysaccharide biosynthesis protein CpsL 


SAG1163 


318 


polysaccharide biosynthesis protein CpsK(V)


SAG1164 


321 


glycosyl transferase CpsJ(V) 


SAG1165 


327 


glycosyl transferase CpsO(V) 


SAG1166 


295 


glycosyl transferase CpsN(V) 


SAG1167


241 


polysaccharide biosynthesis protein CpsM(V)


SAG1168 


364 


polysaccharide biosynthesis protein cpsH(V)


SAG1169 


163 


glycosyl transferase CpsG(V) 


SAG1170 


149 


polysaccharide biosynthesis protein CpsF


SAG1171


462 


glycosyl transferase CpsE 


SAG1172


229


cpsD protein


SAG1173


ZjU 


cpsC protein 


SAG1174 


243 


capsular polysaccharide biosynthesis protein CpsB 


SAG1175 


485 


capsular polysaccharide biosynthesis protein CpsA 


SAG1176 


290 


transcriptional regulator, LysR family, putative 


SAG1177 


255 


conserved hypothetical protein 


SAG1178 


236 


purine nucleoside phosphorylase


SAG1179 


418 


voltage-gated chloride channel family protein, putative


SAG1180


269


purine nucleoside phosphorylase 


SAG1181 


135 


arsenate reductase 


SAG1182


403 


phosphopentomutase 


SAG1183 


223 


ribose 5-phosphate isomerase 


SAG1184 


236 


conserved hypothetical protein 


SAG1185 


262 


tributyrin esterase 


SAG1186 


553 


metallo-beta-lactamase superfamily protein 


SAG1187 


253 


ABC transporter, ATP-binding protein


SAG1188 


287 


ABC transporter, permease protein 


SAG1189 


334 


conserved hypothetical protein 


SAG1190 


551 


adherence and virulence protein A 


SAG1191 


239 


alpha-acetolactate decarboxylase 


SAG1192 


560 


acetolactate synthase, catabolic 


SAG1193 


408 


TPR domain protein 


SAG1194 


396 


membrane protein, putative 


SAG1195 


153 


MutT/nudix family protein 


SAG1196 


160 


mutator MutT protein 


SAG1197 


1072 


hyaluronidase 


SAG1198 


348 


dTDP-glucose 4,6-dehydratase 


SAG1199 


197 


dTDP-4-dehydrorhamnose 3,5-epimerase 


SAG1200 


289 


glucose-1-phosphate thymidylyltransferase


SAG1201 


367 


iminodiacetate oxidase, putative 


SAG1202 


262 


conserved hypothetical protein TIGR00486 


SAG1203 


227 


conserved hypothetical protein 


SAG1204 


226 


DNA replication protein DnaD, putative 


SAG1205 


172 


adenine phosphoribosyltransferase 


SAG1206 


854 


conserved domain protein


SAG1207 


32 


hypothetical protein 


SAG1208 


732 


single-stranded-DNA-specific exonuclease RecJ 


SAG1209 


253 


oxidoreductase, short chain dehydrogenase/reductase family


SAG1210 


309 


metallo-beta-lactamase superfamily protein 


SAG1211 


215 


conserved hypothetical protein 


SAG1212 


412 


GTP-binding protein HflX 


SAG1213 


296 


tRNA delta(2)-isopentenylpyrophosphate transferase 


SAG1214 


58 


hypothetical protein 


SAG1215 


305 


exfoliative toxin A, putative 


SAG1216 


1252 


pullulanase, putative 


SAG1217


NA 


conserved hypothetical protein, authentic frameshift


SAG1218 


194 


conserved hypothetical protein 


SAG1219 


468 


peptidase, M20/M25/M40 family 


SAG1220 


200 


nitroreductase family protein 


SAG1221 


NA


glycerophosphoryl diester phosphodiesterase, putative, authentic
point mutation


SAG1222 


593 


excinuclease ABC, C subunit 


SAG1223 


255


conserved hypothetical protein 


SAG1224 


446 


MATE efflux family protein 


SAG1225 


136 


conserved hypothetical protein 


SAG1226


165 


conserved hypothetical protein


SAG1227


198 


protein of unknown function 


SAG1228 


96 


ISSdy1, transposase OrfA


SAG1229


259 


ISSdy1, transposase OrfB


SAG1230 


96 


conserved hypothetical protein


SAG1231 


NA 


transposase OrfB, IS3 family, degenerate 


SAG1232 


77 


transposase OrfB, IS3 family, truncation 


SAG1233 


822 


streptococcal histidine triad family protein 


SAG1234 


306 


laminin-binding surface protein 


SAG1235 


425 


GBSi1, group II intron, maturase


SAG1236 


NA 


C5a peptidase, authentic frameshift 


SAG1237 


444 


hypothetical protein 


SAG1238 


202 


hypothetical protein 


SAG1239 


76 


conserved hypothetical protein 


SAG1240 


125 


conserved hypothetical protein, truncation 


SAG1241 


91 


transposase OrfA, IS3 family 


SAG1242 


67 


transposase OrfB, IS3 family, truncation 


SAG1243 


96 


ISSdy1, transposase OrfA


SAG1244 


259


ISSdy1, transposase OrfB


SAG1245 


38 


hypothetical protein 


SAG1246 


389 


hypothetical protein 


SAG1247 


399


site-specific recombinase, phage integrase family 


SAG1248 


75 


conserved hypothetical protein 


SAG1249 


74 


transcriptional regulator, Cro/CI family 


SAG1250 


621 


Tn5252, relaxase 


SAG1251 


121


Tn5252, Orf 9 protein


SAG1252 


120 


Tn5252, Orf 10 protein


SAG1253 


435


transposase, ISL3 family 


SAG1254 


546


mercuric reductase 


SAG1255 


130 


mercuric resistance operon regulatory protein MerR 


SAG1256 


142


IS861, transposase OrfB, truncation 


SAG1257


709 


cation-transporting ATPase, E1-E2 family 


SAG1258 


122 


cadmium efflux system accessory protein 


SAG1259


99 


conserved hypothetical protein 


SAG1260 


262 


hypothetical protein 


SAG1261


198


conserved hypothetical protein 


SAG1262 


695 


cation-transporting ATPase, E1-E2 family 


SAG1263 


NA


conserved domain protein, authentic frameshift 


SAG1264


148


transcriptional repressor CopY, putative


SAG1265


206


cadmium resistance transporter, putative 


SAG1266


152 


hypothetical protein 


SAG1267




hypothetical protein 


SAG1268 


210 


repressor protein, putative


SAG1269 


44


hypothetical protein 


SAG1270 


471 


ImpB/MucB/SamB family protein 


SAG1271 


116 


conserved hypothetical protein 


SAG1272 


102 


conserved hypothetical protein 


SAG1273 


118 


conserved hypothetical protein


SAG1274 


129 


conserved hypothetical protein


SAG1275


75 


hypothetical protein 


SAG1276 


358 


conserved hypothetical protein 


SAG1277 


163 


hypothetical protein 


SAG1278


96 


hypothetical protein 


SAG1279 


99 


conserved domain protein 


SAG1280 


2274 


SNF2 family protein 


SAG1281 


183 


hypothetical protein 


SAG1282 


63 


calcium-binding protein, putative 


SAG1283 


1631 


agglutinin receptor 


SAG1284 


196 


abortive infection protein AbiGI 


SAG1285 


281 


abortive infection protein AbiGII 


SAG1286 


933 


Tn5252, Orf28 


SAG1287 


776 


Tn5252, Orf26 


SAG1288 


NA 


Tn5252, Orf25, degenerate


SAG1289 


284 


Tn5252, Orf23 


SAG1290 


80 


hypothetical protein 


SAG1291 


605 


Tn5252, Orf 21 protein, internal deletion 


SAG1292 


162 


hypothetical protein 


SAG1293 


194 


protease, putative 


SAG1294 


77 


conserved hypothetical protein


SAG1295 


127 


conserved hypothetical protein 


SAG1296 


142 


conserved hypothetical protein 


SAG1297 


451 


C-5 cytosine-specific DNA methylase


SAG1298 


31 


hypothetical protein 


SAG1299 


272 


conserved hypothetical protein 


SAG1300 


57 


conserved hypothetical protein 


SAG1301


121 


ribosomal protein L7/L12 


SAG1302 


166 


ribosomal protein L10 


SAG1303 


702 


ATP-dependent Clp protease, ATP-binding subunit 


SAG1304 


32 


hypothetical protein 


SAG1305 


314 


homocysteine S-methyltransferase MmuM, putative 


SAG1306 


458 


amino acid permease 


SAG1307 


216 


hypothetical protein 


SAG1308 


167 


hypothetical protein 


SAG1309 


30 


hypothetical protein 


SAG1310 


182 


transcriptional regulator, TetR family 


SAG1311 


198 


GTP-binding protein 


SAG1312 


408 


ATP-dependent Clp protease, ATP-binding subunit ClpX 


SAG1313 


56 


conserved hypothetical protein 


SAG1314 


164 


dihydrofolate reductase 


SAG1315 


279 


thymidylate synthase 


SAG1316 


390 


HMG-CoA synthase 


SAG1317 


427 


3-hydroxy-3-methylglutaryl-CoA reductase 


SAG1318 


149 


conserved hypothetical protein 


SAG1319 


214 


hemolysin III, putative 


SAG1320 


304 


conserved hypothetical protein TIGR00147 


SAG1321 


284 


glutathione S-transferase family protein, putative 


SAG1322


72 


conserved domain protein


SAG1323


331 


isopentenyl-diphosphate delta-isomerase 


SAG1324 


330 


phosphomevalonate kinase 


SAG1325 


314 


diphosphomevalonate decarboxylase


SAG1326 


292 


mevalonate kinase, putative 


SAG1327 


409 


sensor histidine kinase 


SAG1328 


228 


DNA-binding response regulator 


SAG1329 


208 


GTP pyrophosphokinase family protein 


SAG1330 


68 


hypothetical protein 


SAG1331 


979 


R5 protein 


SAG1332 


146 


transcriptional regulator, MarR family, putative 


SAG1333 


690 


5'-nucleotidase family protein


SAG1334 


136 


polypeptide deformylase, putative 


SAG1335 


449 


NADP-specific glutamate dehydrogenase 


SAG1336 


169 


membrane protein, putative


SAG1337 


589 


ABC transporter, ATP-binding/permease protein 


SAG1338 


579 


ABC transporter, ATP-binding/permease protein 


SAG1339 


157 


acetyltransferase, GNAT family 


SAG1340 


622 


ABC transporter, ATP-binding protein


SAG1341 


402 


polyA polymerase family protein 


SAG1342 


282 


DegV family protein 


SAG1343 


126 


protein of unknown function


SAG1344 


177 


hypothetical protein 


SAG1345 


164 


conserved hypothetical protein 


SAG1346 


654 


PTS system, fructose specific IIABC components 


SAG1347 


303 


1-phosphofructokinase


SAG1348 


247 


lactose phosphotransferase system repressor 


SAG1349 


411 


beta-lactam resistance factor 


SAG1350 


544 


surface antigen-related protein 


SAG1351 


307 


2-dehydropantoate 2-reductase, putative 


SAG1352 


356 


regulatory protein, putative 


SAG1353 


330 


pyridine nucleotide-disulphide oxidoreductase family protein 


SAG1354 


251 


tRNA (guanine-N1)-methyltransferase


SAG1355 


172 


16S rRNA processing protein RimM 


SAG1356 


503 


transcriptional regulator, RofA family


SAG1357 


80 


KH domain protein 


SAG1358 


90 


ribosomal protein S16


SAG1359


415 


permease, putative 


SAG1360 


236 


ABC transporter, ATP-binding protein 


SAG1361 


414 


conserved hypothetical protein 


SAG1362 


532 


carbamoyl-phosphate synthase, large subunit, putative 


SAG1363 


356 


carbamoyl-phosphate synthase, small subunit 


SAG1364 


173 


pyrimidine operon regulatory protein


SAG1365 


296 


ribosomal large subunit pseudouridine synthase, RluD subfamily 


SAG1366 


154 


lipoprotein signal peptidase 


SAG1367 


301 


transcriptional regulator, LysR family 


SAG1368 


94 


ribosomal protein L27 


SAG1369 


112 


conserved hypothetical protein


SAG1370 


104 


ribosomal protein L21



complete list of GBS predicted genes


SAG1371


192 


conserved hypothetical protein


SAG1372


404 


thiamine biosynthesis protein ThiI


SAG1373


1R1 

-JO 1 


cysteine desulphurase


SAG1374


1 5fi 


conserved hypothetical protein


SAG1375


440 


glutathione reductase


SAG1376


111


conserved hypothetical protein 


SAG1377


JOO 


chorismate synthase 


SAG1378


1^ 


3-dehydroquinate synthase






3-dehydroquinate dehydratase 




lft< 


conserved hypothetical protein 


SAG1381


T1 A 


sulfatase 




i in 


ribosomal protein L20 




oo 


ribosomal protein L35 




1 /o 


translation initiation factor IF-3


SAG1385 


227 


cytidylate kinase 


SAG1386 


174 


conserved hypothetical protein 


SAG1387 


65 


ferredoxin, 4Fe-4S 


SAG1388


163 


conserved hypothetical protein 


SAG1389


406 


peptidase T 


SAG1390


544 


polysaccharide biosynthesis protein, putative 


SAG1391


484 


UDP-N-acetylmiii^oyW 
ligase 


SAG1392


264 


iron compound ABC transporter, ATP-binding protein 




Ol A 

31U 


iron compound ABC transporter, substrate-binding protein 




341 


iron compound ABC transporter, permease protein 


SAG1395


333 


iron compound ABC transporter, permease protein 


SAG1396




conserved hypothetical protein 


SAG1397


^1 1 
Oil 


inorganic pyrophosphatase, manganese-dependent 


SAG1398




pyruvate formate-lyase-activating enzyme 






CBS domain protein 




loo 


conserved hypothetical protein 


SAG1401 


11 1 


conserved hypothetical protein TIGR01212 


SAG1402


71 1 

Z>1 J 


jr/vrz ramiiy proiein 


SAG1403


104 
lyf 


membrane protein, putative


SAG1404 


J I/O 


cell wall surface anchor family protein


SAG1405


704 


sortase family protein


SAG1406 


701 


sortase family protein


SAG1407 


705 


cell wall surface anchor family protein


SAG1408


Q01 


cell wall surface anchor family protein


SAG1409 


NA 


lUglJ 1/AULwAIi, aUUUCiiULf u OIX1 Colli 1L 


SAG1410 


170 


glycosyl transferase, group 1 family protein


SAG1411 


282 


glycosyl transferase, group 2 family protein


SAG1412 


474 


polysaccharide biosynthesis protein 


SAG1413 


454 


membrane protein, putative 


SAG1414 


308 


glycosyl transferase, group 2 family protein


SAG1415 


311 


glycosyl transferase, group 2 family protein 


SAG1416 


352 


nucleotide sugar dehydratase, putative 


SAG1417 


240 


nucleotidyl transferase, putative


SAG1418


274 


polysaccharide biosynthesis protein, putative 


SAG1419 


577 


lipoprotein, putative 


SAG1420 


117 


conserved hypothetical protein 


SAG1421 


243 


glycosyl transferase, group 2 family protein 


SAG1422 


313 


glycosyl transferase, group 2 family protein 


SAG1423 


384 


glycosyl transferase, putative 


SAG1424 


284 


dTDP-4-dehydrorhamnose reductase 


SAG1425 


113 


conserved hypothetical protein 


SAG1426 


369 


RNA polymerase sigma-70 factor 


SAG1427 


602 


DNA primase 


SAG1428 


125 


large conductance mechanosensitive channel protein 


SAG1429 


58 


ribosomal protein S21 


SAG1430 


167 


conserved hypothetical protein 


SAG1431 


268 


amino acid ABC transporter, amino acid-binding protein 


SAG1432 


347 


ammonium transporter family protein 


SAG1433


375 


conserved hypothetical protein 


SAG1434 


328 


rhodanese family protein 


SAG1435 


101 


conserved hypothetical protein 


SAG1436 


457 


glycerol-3-phosphate transporter, putative 


SAG1437 


55 


hypothetical protein 


SAG1438 


754 


glycogen phosphorylase 


SAG1439 


498 


4-alpha-glucanotransferase 


SAG1440 


342 


maltose operon repressor MalR, putative 


SAG1441 


415 


maltose/maltodextrin ABC transporter, maltose/maltodextrin- 
binding protein 


SAG1442 


456 


maltose ABC transporter, permease protein 


SAG1443 


278 


maltose ABC transporter, permease protein 


SAG1444 


490 


proton/peptide symporter family protein 


SAG1445 


NA 


MutT/nudix family protein, authentic frameshift 


SAG1446 


62 


hypothetical protein 


SAG1447 


441 


conserved hypothetical protein 


SAG1448 


502 


glycosyl transferase, group 1 family protein 


SAG1449 


795 


preprotein translocase SecA subunit, putative 


SAG1450 


330 


conserved domain protein 


SAG1451 


494 


conserved hypothetical protein 


SAG1452 


514 


conserved hypothetical protein 


SAG1453 


409 


preprotein translocase SecY family protein 


SAG1454 


398 


glycosyl transferase, putative 


SAG1455 


295 


glycosyl transferase, group 2 family protein 


SAG1456 


NA 


glycosyl transferase, family 8, degenerate 


SAG1457 


129 


IS1381, transposase OrfB 


SAG1458


127 


IS1381, transposase OrfA


SAG1459 


413 


glycosyl transferase, family 8


SAG1460 


401 


glycosyl transferase, family 8 


SAG1461 


335 


conserved hypothetical protein 


SAG1462 


970 


cell wall surface anchor family protein 


SAG1463 


NA 


transcriptional regulator, RofA family, authentic point mutation 


SAG1464 


663 


excinuclease ABC, B subunit


SAG1465




protease, putative 


SAG1466


inn 


glutamine ABC transporter, glutamine-binding protein/permease 
protein 


SAG1467


/.HO 


glutamine ABC transporter, ATP-binding protein GlnQ, putative


SAG1468


1 ID 


conserved hypothetical protein 






conserved hypothetical protein 


SAG1470


4 J / 


GTP-binding protein, GTP1/Obg family


SAG1471


42


conserved hypothetical protein 


SAG1472


413


aminopeptidase PepS 


SAG1473


192 


cell wall surface anchor family protein 


SAG1474


OoU 


amidase family protein 


SAG1475


240 


ribosomal small subunit pseudouridine synthase A 


SAG1476


280 


oxidoreductase, aldo/keto reductase family 


SAG1477


224 


nitroreductase family protein


SAG1478


130 


lactoylglutathione lyase 


SAG1479


308 


glycosyl transferase, group 2 family protein 


SAG1480


462 


amino acid permease 


SAG1481


155 


SsrA-binding protein


SAG1482


801 


exoribonuclease, VacB/Rnb family


SAG1483


78 


preprotein translocase, SecG subunit


SAG1484


48 


ribosomal protein L33 


SAG1485 


389 


multi-drug resistance protein 


SAG1486


548


membrane protein, putative 


SAG1487


233 


ABC transporter, ATP-binding protein


SAG1488


195 


dephospho-CoA kinase 


SAG1489


2/3 


formamidopyrimidine-DNA glycosylase


SAG1490


282 


transcriptional regulator, MutR family


SAG1491


MA 

53U 


hypothetical protein 




JO 


hypothetical protein 




00 


hypothetical protein 






hypothetical protein




! ft 1 
01 


CAAX amino terminal protease family protein 


SAG1496 


110 


hypothetical protein 


SAG1497




hypothetical protein 






hypothetical protein 




z>yy 


GTP-binding protein Era


SAG1500


1 io 


diacylglycerol kinase




101 


conserved hypothetical protein TIGR00043


SAG1502


zoo 


tetracenomycin polyketide synthesis O-methyltransferase TcmP,


SAG1503 


jy 


hypothetical protein


SAG1504 


38 


hypothetical protein 


SAG1505


158 


MutT/nudix family protein 


SAG1506 


267 


hypothetical protein 


SAG1507 


345 


PhoH family protein 


SAG1508 


590 


67 kDa Myosin-crossreactive streptococcal antigen 


SAG1509 


71 


conserved hypothetical protein 


SAG1510 


169 


peptide methionine sulfoxide reductase


SAG1511


284 


conserved hypothetical protein


SAG1512 


185 


ribosome recycling factor 


SAG1513 


242 


uridylate kinase 


SAG1514 


226 


peptide ABC transporter, ATP-binding protein


SAG1515 


262 


peptide ABC transporter, ATP-binding protein


SAG1516 


255 


peptide ABC transporter, permease protein 


SAG1517 


314 


peptide ABC transporter, permease protein 


SAG1518 


538 


peptide ABC transporter, peptide-binding protein 


SAG1519 


229 


ribosomal protein L1


SAG1520


141 


ribosomal protein L11


SAG1521 


388 


transposase, IS30 family, putative


SAG1522 


460 


transporter, major facilitator family 


SAG1523 


404 


peptidase, M20/M25/M40 family


SAG1524 


294 


transcriptional regulator, LysR family 


SAG1525 


117 


conserved hypothetical protein 


SAG1526 


178 


IS861, transposase OrfA


SAG1527 


277 


IS861, transposase OrfB 


SAG1528 


571 


chorismate binding enzyme


SAG1529


816 


FtsK/SpoIIIE family protein 


SAG1530 


267 


peptidyl-prolyl cis-trans isomerase, cyclophilin-type 


SAG1531 


277 


manganese ABC transporter, permease protein 


SAG1532 


238 


manganese ABC transporter, ATP-binding protein


SAG1533 


308 


manganese ABC transporter, manganese-binding adhesion
lipoprotein


SAG1534 


215 


iron-dependent transcriptional regulator


SAG1535 


229 


5-methylthioadenosine nucleosidase/S-adenosylhomocysteine nucleosidase


SAG1536 


89 


conserved hypothetical protein


SAG1537 


184 


MutT/nudix family protein 


SAG1538 


459 


UDP-N-acetylglucosamine pyrophosphorylase 


SAG1539 


31 


hypothetical protein 


SAG1540 


137 


conserved hypothetical protein 


SAG1541 


125 


glyoxalase family protein 


SAG1542 


318 


oxidoreductase, Gfo/Idh/MocA family


SAG1543 


NA


conserved hypothetical protein, authentic frameshift


SAG1544 


232 


gluconate 5-dehydrogenase, putative


SAG1545


TO 

/o 


conserved hypothetical protein




oo 


conserved hypothetical protein


SAG1547


loo 


acetyltransferase, GNAT family


SAG1548


4ZZ 


glycosyl transferase, group 2 family protein


SAG1549


127


IS1381, transposase OrfA


SAG1550 


129 


IS1381, transposase OrfB


SAG1551 


67 


hypothetical protein 


SAG1552 


719


conserved hypothetical protein 


SAG1553 


477 


hypothetical protein 


SAG1554 


225 


hypothetical protein 


SAG1555 


231 


hypothetical protein 


SAG1556 


445 


branched-chain amino acid transport system II carrier protein 






Table II: Complete list of GBS predicted genes

Annotation 


SAG1557 


665 


methionyl-tRNA synthetase 


SAG1558


291 


tellurite resistance protein TehB 


SAG1559 


231


membrane protein, putative 


SAG1560


40 


hypothetical protein 


SAG1561 


405 


PTS system, IIC component, putative 


SAG1562


280 


conserved hypothetical protein


SAG1563 


275 


exodeoxyribonuclease 


SAG1564 


118


conserved hypothetical protein 


SAG1565 


158 


methylated-DNA--protein-cysteine S-methyltransferase


SAG1566 


393 


D-isomer specific 2-hydroxyacid dehydrogenase family protein 


SAG1567 


182 


acetyltransferase, GNAT family 


SAG1568 


NA 


phosphoserine aminotransferase, authentic frameshift 


SAG1569 


211 


copper homeostasis protein CutC, putative 


SAG1570 


34 


conserved hypothetical protein 


SAG1571 


53 


hypothetical protein 


SAG1572 


287 


tetrapyrrole methylase family protein


SAG1573 


108 


conserved hypothetical protein 


SAG1574 


287 


DNA polymerase III, delta prime subunit, putative


SAG1575 


211 


thymidylate kinase


SAG1576 


267 


transposase, IS30 family, putative, truncation 


SAG1577 


219 


AcuB family protein 


SAG1578 


236 


branched-chain amino acid ABC transporter, ATP-binding protein 


SAG1579 


254 


branched-chain amino acid ABC transporter, ATP-binding protein 


SAG1580 


317 


branched-chain amino acid ABC transporter, permease protein 


SAG1581 


289 


branched-chain amino acid ABC transporter, permease protein 


SAG1582 


388 


branched-chain amino acid ABC transporter, amino acid-binding protein


SAG1583 


81 


conserved hypothetical protein 


SAG1584 


377 


IS1548, transposase


SAG1585 


196 


ATP-dependent Clp protease, proteolytic subunit ClpP


SAG1586 


209 


uracil phosphoribosyltransferase


SAG1587 


389 


aminotransferase, class I 


SAG1588


182 


RNA methyltransferase, TrmH family, group 2 


SAG1589


450 


amino acid permease, putative 




449 


potassium uptake protein, Trk family 


SAG1591


475 


cation uptake protein, Trk family 


SAG1592


oi 


conserved hypothetical protein 11CjK0027o


SAG1593


Z4U 


ribosomal large subunit pseudouridine synthase B 


SAG1594


1V4 


conserved hypothetical protein IHjKUUzoI


SAG1595


ZJ5 


conserved hypothetical protein 


SAG1596


Z40 


integrase/recombinase, phage integrase family 


SAG1597 


157 


UV/AliCAXXi. JJJL \J kvl 11 


SAG1598 


173 


conserved hypothetical protein 


SAG1599 


324 


HAM1 protein 


SAG1600 


264 


glutamate racemase 


SAG1601 


79 


conserved hypothetical protein 


SAG1602 


180 


membrane protein, putative 


SAG1603 


173 


transcriptional regulator, biotin repressor family


SAG1604


229 


membrane protein, putative 


SAG1605 


167 


conserved hypothetical protein 


SAG1606 


247 


RNA methyltransferase, TrmH family 


SAG1607 


92 


acylphosphatase 


SAG1608


310 


lipoprotein, putative 


SAG1609 


221 


amino acid ABC transporter, permease protein 


SAG1610 


285 


amino acid ABC transporter, substrate-binding protein 


SAG1611 


486 


amidase family protein


SAG1612 


160 


transcription elongation factor GreA 


SAG1613 


600 


conserved hypothetical protein 


SAG1614 


167 


acetyltransferase, GNAT family 


SAG1615 


443 


UDP-N-acetylmuramate--alanine ligase


SAG1616 


205 


conserved hypothetical protein 


SAG1617 


32 


hypothetical protein 


SAG1618 


1032 


Snf2 family protein 


SAG1619 


377 


IS1548, transposase


SAG1620 


436 


phosphoglycerate dehydrogenase-related protein 


SAG1621 


300 


primosomal protein DnaI


SAG1622 


391 


conserved hypothetical protein 


SAG1623 


159 


conserved hypothetical protein TIGR00244 


SAG1624 


501 


sensor histidine kinase CsrS 


SAG1625 


229 


DNA-binding response regulator CsrR 


SAG1626 


177 


conserved hypothetical protein 


SAG1627 


296 


heat shock protein HtpX 


SAG1628 


184 


lemA protein 


SAG1629 


237 


glucose-inhibited division protein B 


SAG1630 


459 


sodium transport family protein 


SAG1631 


223 


potassium uptake protein, Trk family, putative 


SAG1632 


276 


cobalt transport family protein 


SAG1633 


558 


ABC transporter, ATP-binding protein 


SAG1634 


212 


conserved hypothetical protein 


SAG1635 


402 


sodium:dicarboxylate symporter family protein


SAG1636 


455 


branched-chain amino acid transport system II carrier protein 


SAG1637 


351 


alcohol dehydrogenase, zinc-containing 


SAG1638 


230 


ABC transporter, permease protein 


SAG1639 


356 


ABC transporter, ATP-binding protein


SAG1640 


458 


peptidase, M20/M25/M40 family 


SAG1641 


274 


YaeC family protein 


SAG1642 


277 


ABC transporter, substrate-binding protein 


SAG1643 


229 


glutamine amidotransferase, class I 


SAG1644 


37 


hypothetical protein


SAG1645 


238


conserved hypothetical protein TIGR01033


SAG1646 


32 


hypothetical protein 


SAG1647 


328 


dihydroxyacetone kinase family protein 


SAG1648 


178 


transcriptional regulator, TetR family, putative 


SAG1649 


37 


hypothetical protein 


SAG1650 


329 


dihydroxyacetone kinase family protein 


SAG1651 


192 


dihydroxyacetone kinase family protein


SAG1652


124 


conserved hypothetical protein 


SAG1653 


237 


glycerol uptake facilitator protein 


SAG1654


134


conserved hypothetical protein 


SAG1655 


237 


transcriptional regulator, MerR family 


SAG1656 


369 


conserved hypothetical protein 


SAG1657 


83 


hypothetical protein 


SAG1658 


244 


conserved hypothetical protein 


SAG1659


118 


iojap-related protein 


SAG1660


173 


isochorismatase family protein 


SAG1661 


195 


conserved hypothetical protein TIGR00488 


SAG1662 


210 


conserved hypothetical protein TIGR00482 


SAG1663 


105 


conserved hypothetical protein TIGR00253 


SAG1664 


372 


GTP-binding protein 


SAG1665 


177 


hydrolase, haloacid dehalogenase-like family 


SAG1666 


304 


membrane protein, putative 


SAG1667 


480 


glutamyl-tRNA(Gln) amidotransferase, B subunit 


SAG1668 


488 


glutamyl-tRNA(Gln) amidotransferase, A subunit


SAG1669 


100 


glutamyl-tRNA(Gln) amidotransferase, C subunit 


SAG1670 


881 


pyruvate phosphate dikinase 


SAG1671 


276 


protein of unknown function 


SAG1672 


170 


CBS domain protein 


SAG1673 


321 


3-hydroxyacyl-CoA dehydrogenase family protein 


SAG1674 


182 


isochorismatase family protein 


SAG1675 


261 


transcriptional regulator CodY, putative 


SAG1676 


403 


aminotransferase, class I 


SAG1677 


150 


conserved hypothetical protein 


SAG1678 


460 


hydrolase, haloacid dehalogenase-like family 


SAG1679 


320 


asparaginase family protein 


SAG1680 


292 


shikimate 5-dehydrogenase 


SAG1681 


304 


oxidoreductase, aldo/keto reductase family 


SAG1682


671


ATP-dependent DNA helicase RecG


SAG1683 


512 


immunogenic secreted protein, putative 


SAG1684


366 


alanine racemase 


SAG1685 


119 


holo-(acyl-carrier-protein) synthase


SAG1686


335 


phospho-2-dehydro-3-deoxyheptonate aldolase


SAG1687


842


preprotein translocase, SecA subunit 


SAG1688


315 


mannose-6-phosphate isomerase, class I 




zy3 


fructokinase 






PTS system, IIABC components




A TO 


sucrose-6-phosphate hydrolase 




3ZU 


sucrose operon repressor ScrR 


SAG1693


144 


N utilization substance protein B


SAG1694 


129 


conserved hypothetical protein 


SAG1695 


186 


translation elongation factor P 


SAG1696 


38 


hypothetical protein 


SAG1697 


48 


hypothetical protein 


SAG1698 


99 


conserved hypothetical protein 


SAG1699 


30 


hypothetical protein


SAG1700 


76 


hypothetical protein 


SAG1701 


56 


hypothetical protein 


SAG1702 


41 


hypothetical protein 


SAG1703 


54 


hypothetical protein 


SAG1704 


150 


cytidine/deoxycytidylate deaminase family protein 


SAG1705 


NA 


peptidase, M24 family, authentic point mutation 


SAG1706


238 


conserved hypothetical protein 


SAG1707 


499 


drug resistance transporter, EmrB/QacA family 


SAG1708 


38 


hypothetical protein 


SAG1709 


942 


excinuclease ABC, A subunit 


SAG1710 


223 


conserved hypothetical protein 


SAG1711 


314 


magnesium transporter, CorA family 


SAG1712 


79 


ribosomal protein S18


SAG1713 


163 


single-strand binding protein 


SAG1714 


95 


ribosomal protein S6 


SAG1715 


374 


A/G-specific adenine glycosylase 


SAG1716 


197 


transcriptional regulator, Cro/CI family 


SAG1717 


104 


thioredoxin 


SAG1718 


166 


PAP2 family protein 


SAG1719 


779 


MutS2 family protein


SAG1720 


180 


conserved hypothetical protein


SAG1721 


103 


conserved hypothetical protein 


SAG1722 


297 


ribonuclease HIII


SAG1723


197 


signal peptidase I 


SAG1724 


806 


helicase, putative 


SAG1725 


160 


conserved hypothetical protein 


SAG1726 


364 


DNA-damage-inducible protein P 


SAG1727 


770 


formate acetyltransferase 


SAG1728 


124 


FMN-binding protein 


SAG1729 


309 


conserved hypothetical protein


SAG1730 


251 


conserved hypothetical protein 


SAG1731 


298 


membrane protein, putative 


SAG1732 


282 


glycerol uptake facilitator protein, putative 


SAG1733 


150 


universal stress protein family 


SAG1734 


400 


transporter, putative 


SAG1735 


219 


transcriptional regulator, Crp/Fnr family 


SAG1736 


761 


X-pro dipeptidyl-peptidase 


SAG1737 


119 


hypothetical protein


SAG1738 


326 


polyprenyl synthetase family protein 


SAG1739 


582 


ABC transporter, ATP-binding protein CydC


SAG1740 


572 


ABC transporter, ATP-binding protein CydD


SAG1741




cytochrome d ubiquinol oxidase, subunit II


SAG1742 


475 


cytochrome d oxidase, subunit I 


SAG1743 


402 


pyridine nucleotide-disulphide oxidoreductase family protein 


SAG1744 


299 


prenyltransferase, UbiA family 


SAG1745 


148 


hypothetical protein 


SAG1746 


35 


hypothetical protein 


SAG1747


99 


conserved hypothetical protein TIGR00103


SAG1748


396 


cyclopropane-fatty-acyl-phospholipid synthase


SAG1749


241 


transcriptional regulator, MerR family


SAG1750 


195 


exonuclease 


SAG1751 


178 


conserved hypothetical protein 


SAG1752 


390 


conserved hypothetical protein TIGR00275 


SAG1753 


260 


conserved hypothetical protein 


SAG1754 


89 


ribosomal protein S14 


SAG1755 


38 


hypothetical protein 


SAG1756 


341 


conserved hypothetical protein 


SAG1757 


336 


O-sialoglycoprotein endopeptidase family protein


SAG1758 


135 


ribosomal-protein-alanine acetyltransferase, putative 


SAG1759 


230 


protein of unknown function 


SAG1760 


76 


conserved hypothetical protein 


SAG1761 


559 


metallo-beta-lactamase superfamily protein 


SAG1762 


169 


conserved hypothetical protein 


SAG1763 


448 


glutamine synthetase, type I 


SAG1764 


123 


transcriptional regulator GlnR 


SAG1765 


179 


conserved hypothetical protein 


SAG1766 


398 


phosphoglycerate kinase 


SAG1767 


289 


acid phosphatase 


SAG1768 


336 


glyceraldehyde 3-phosphate dehydrogenase


SAG1769 


692 


translation elongation factor G 


SAG1770 


156 


ribosomal protein S7 


SAG1771 


137 


ribosomal protein S12 


SAG1772 


270


pur operon repressor 


SAG1773 


313 


HD domain protein 


SAG1774 


424 


conserved hypothetical protein 


SAG1775 


210 


conserved hypothetical protein 


SAG1776 


220 


ribulose-phosphate 3-epimerase 


SAG1777 


290 


conserved hypothetical protein TIGR00157 


SAG1778 


283 


rRNA (guanine-N1-)-methyltransferase, putative


SAG1779 


290 


dimethyladenosine transferase 


SAG1780 


163 


hypothetical protein 


SAG1781 


186 


primase-related protein 


SAG1782 


260 


deoxyribonuclease, TatD family 


SAG1783 


90 


hypothetical protein 


SAG1784 


130 


hypothetical protein 


SAG1785 


430 


hypothetical protein 


SAG1786 


130 


protein of unknown function 


SAG1787 


420 


dltD protein


SAG1788 


79 


D-alanyl carrier protein


SAG1789




dltB protein


SAG1790 


511 


D-alanine-activating enzyme 


SAG1791 


395 


sensor histidine kinase 


SAG1792 


224 


DNA-binding response regulator 


SAG1793 


44 


ribosomal protein L34 


SAG1794 


451 


membrane protein, putative 


SAG1795 


388 


transposase, IS30 family, putative


SAG1796


<T^ 


amino acid ABC transporter, permease protein


SAG1797


/1AT 


amino acid ABC transporter, ATP-binding protein


SAG1798


jy 


hypothetical protein


SAG1799


TOO 

/yz 


xylulose-5-phosphate/fructose-6-phosphate phosphoketolase


SAG1800




conserved hypothetical protein 


SAG1801


jDy 


transcriptional antiterminator, BglG family 


SAG1802


t^q> 


conserved hypothetical protein 


SAG1803


DUD 


carbohydrate kinase, FGGY family 


SAG1804


000 

3Zy 


hypothetical protein 


SAG1805


483 


PTS system, IIC component, putative 


SAG1806


318 


glyoxylate reductase, NADH-dependent 


SAG1807


33y 


hypothetical protein 


SAG1808


327 


sugar binding transcriptional regulator, LacI family 


SAG1809


215 


transaldolase family protein 


SAG1810


238 


carbohydrate isomerase, AraD/FucA family


SAG1811


287 


hexulose-6-phosphate isomerase, putative


SAG1812


221 


hexulose-6-phosphate synthase, putative 


SAG1813 


161 


PTS system, IIA component 


SAG1814 


92 


PTS system, IIB component 


SAG1815 


479 


transport protein SgaT, putative


SAG1816 


205 


hypothetical protein


SAG1817


157 


hypothetical protein 


SAG1818


430 


adenylosuccinate synthetase 


SAG1819


340 


perfringolysin O regulator protein


SAG1820


224 


conserved hypothetical protein 


SAG1821


750 


glutamate--cysteine ligase/amino acid ligase, putative


SAG1822


2/2 


protein of unknown function


SAG1823


418 


protein of unknown function 


SAG1824


toi 

zy 1 


chaperonin, 33 kDa 


SAG1825


3ZD 


NifR3/SMM1 family protein


SAG1826


T1 ^ 


deoxynucleoside kinase family protein 


SAG1827




phosphinothricin N-acetyltransferase




1 £1 ^ 


ATP-dependent Clp protease, ATP-binding subunit


SAG1829


ID*! 


transcriptional regulator CtsR


SAG1830




conserved hypothetical protein






translation elongation factor Ts


SAG1832




ribosomal protein S2




1 8 a* 


alkyl hydroperoxide reductase, subunit C


SAG1834


sio 


alkyl hydroperoxide reductase, subunit F


SAG1835 


1 


conserved hypothetical protein




61


conserved hypothetical protein


SAG1837 


468


prophage LambdaSa2, lysin, putative


SAG1838 


109 


prophage LambdaSa2, holin, putative 


SAG1839 


136 


conserved hypothetical protein 


SAG1840 


112


hypothetical protein 


SAG1841 


76 


conserved domain protein 


SAG1842 


1224 


prophage LambdaSa2, PblB, putative 


SAG1843 


240 


conserved hypothetical protein


SAG1844




conserved hypothetical protein 


SAG1845


42 


hypothetical protein 


SAG1846 


158 


hypothetical protein 


SAG1847 


227 


conserved hypothetical protein 


SAG1848 


114 


conserved hypothetical protein 


SAG1849 


115 


hypothetical protein 


SAG1850


101 


hypothetical protein 


SAG1851 


111 


conserved domain protein 


SAG1852 


420 


conserved domain protein 


SAG1853 


180 


prophage LambdaSa2, protease, putative 


SAG1854 


380 


conserved hypothetical protein 


SAG1855


570 


prophage LambdaSa2, terminase large subunit, putative


SAG1856 


161


hypothetical protein 


SAG1857 


119 


prophage LambdaSa2, HNH endonuclease family protein 


SAG1858 


95 


hypothetical protein 


SAG1859 


180 


prophage LambdaSa2, site-specific recombinase, phage integrase family


SAG1860 


154 


conserved hypothetical protein 


SAG1861 


119 


prophage LambdaSa2, transcriptional regulator, Cro/CI family 


SAG1862 


86 


hypothetical protein 


SAG1863 


138 


prophage LambdaSa2, single-strand binding protein 


SAG1864 


68 


hypothetical protein 


SAG1865 


74 


conserved hypothetical protein 


SAG1866 


109 


conserved hypothetical protein 


SAG1867 


163 


conserved hypothetical protein 


SAG1868 


134 


hypothetical protein 


SAG1869 


437 


prophage LambdaSa2, type II DNA modification methyltransferase, putative


SAG1870 


273 


prophage LambdaSa2, DNA replication protein DnaC, putative


SAG1871 


248 


prophage LambdaSa2, bacteriophage replication protein/hypothetical protein, truncation/fusion


SAG1872


200 


hypothetical protein 


SAG1873


443 


prophage LambdaSa2, replicative DNA helicase


SAG1874


87 


hypothetical protein


SAG1875


y4 


conserved hypothetical protein 


SAG1876


176 


prophage LambdaSa2, HNH endonuclease family protein 


SAG1877


2 Jo 


prophage LambdaSa2, antirepressor protein, putative 


SAG1878


i no 
1U2 


conserved domain protein 


SAG1879 


156 


hypothetical protein 


SAG1880


54 


hypothetical protein 


SAG1881




hypothetical protein 


SAG1882 




prophage LambdaSa2, repressor protein, putative


SAG1883 


128 


conserved hypothetical protein 


SAG1884 


134 


hypothetical protein 


SAG1885


356 


prophage LambdaSa2, site-specific recombinase, phage integrase family


SAG1886 


32 


hypothetical protein 


SAG1887 


689 


Na+/H+ exchanger family protein


SAG1888


78 


hypothetical protein 


SAG1889


317 


microcin immunity protein MccF, putative 


SAG1890 


631 


endopeptidase O 


SAG1891 


327 


oxidoreductase, Gfo/Idh/MocA family 


SAG1892 


358 


membrane protein, putative 


SAG1893 


59 


hypothetical protein 


SAG1894 


214 


cyclic nucleotide-binding domain protein 


SAG1895 


204 


polypeptide deformylase 


SAG1896 


333 


sugar binding transcriptional regulator RegR 


SAG1897


634 


conserved hypothetical protein 


SAG1898 


271 


PTS system, IID component


SAG1899 


288 


PTS system, IIC component 


SAG1900 


164 


PTS system, IIB component 


SAG1901


398 


glucuronyl hydrolase 


SAG1902 


144 


PTS system, IIA component


SAG1903 


34 


hypothetical protein 


SAG1904 


270 


oxidoreductase, short-chain dehydrogenase/reductase family 


SAG1905


212 


conserved hypothetical protein 


SAG1906 


335 


carbohydrate kinase, PfkB family 


SAG1907 


212 


2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase


SAG1908 


499 


hypothetical protein 


SAG1909 


204 


nitroreductase family protein 


SAG1910 


141


transcriptional regulator, MarR family 


SAG1911 


1468


DNA polymerase III, alpha subunit, Gram-positive type 


SAG1912 


194 


N-acetylmuramoyl-L-alanine amidase, family 4 protein


SAG1913 


617 


prolyl-tRNA synthetase 


SAG1914 


419 


membrane-associated zinc metalloprotease, putative 


SAG1915 


264


phosphatidate cytidylyltransferase 


SAG1916 


250 


undecaprenyl diphosphate synthase 


SAG1917 


113 


preprotein translocase, YajC subunit 


SAG1918 


114


bacteriocin transport accessory protein, putative 


SAG1919 


387 


malate oxidoreductase 


SAG1920 


445 


citrate carrier protein, CCS family


SAG1921 


508 


sensor histidine kinase 


SAG1922 


229


response regulator 


SAG1923 


331


UDP-glucose 4-epimerase 


SAG1924 


535 


glucan 1,6-alpha-glucosidase 


SAG1925 


377 


sugar ABC transporter, ATP-binding protein 


SAG1926 


283


helix-turn-helix domain protein, fis-type 


SAG1927 


298


lacX protein 




! io« 
1 JZD 


tagatose 1,6-diphosphate aldolase


SAG1929 


310


tagatose-6-phosphate kinase 


SAG1930 


171


galactose-6-phosphate isomerase, LacB subunit 


SAG1931 


141


galactose-6-phosphate isomerase, LacA subunit


SAG1932 


816


neuraminidase-related protein 


SAG1933 


482 


PTS system, IIC component, putative 


SAG1934 


101


PTS system, IIB component, putative


SAG1935


157 


PTS system, IIA component, putative


SAG1936 


258 


lactose phosphotransferase system repressor


SAG1937 


NA 


streptococcal histidine triad family protein, degenerate 


SAG1938 


307 


adhesion lipoprotein 


SAG1939 


147 


protein of unknown function liOKUU25o


SAG1940 


738 


GTP pyrophosphokinase family protein 


SAG1941 


800 


2',3'-cyclic-nucleotide 2'-phosphodiesterase


SAG1942 


151


nrdI protein


SAG1943 


345 


conserved hypothetical protein 


SAG1944 


165 


conserved hypothetical protein


SAG1945 


345 


iron ABC transporter, iron-binding protein


SAG1946 


257 


DNA-binding response regulator 


SAG1947 


549 


conserved hypothetical protein 


SAG1948 


275 


PTS system, IID component


SAG1949 


269 


PTS system, IIC component


SAG1950 


163 


PTS system, IIB component


SAG1951 


141


PTS system, IIA component, putative


SAG1952 


353 


membrane protein, putative 


SAG1953 


60 


hypothetical protein 


SAG1954 


384 


membrane protein, putative 


SAG1955 


282 


ABC transporter, ATP-binding protein


SAG1956 


96 


conserved hypothetical protein, truncation 


SAG1957 


250 


response regulator 


SAG1958 


276 


conserved hypothetical protein 


SAG1959 


727 


PTS system, IIABC components


SAG1960


551 


sensor histidine kinase 


SAG1961 


225 


phosphate regulon response regulator PhoB 


SAG1962 


218 


phosphate transport system regulatory protein PhoU, putative 


SAG1963 


253 


phosphate ABC transporter, ATP-binding protein


SAG1964 


292 


phosphate ABC transporter, permease protein 


SAG1965 


281 


phosphate ABC transporter, permease protein 


SAG1966 


293 


hemolysin precursor, putative 


SAG1967 


195 


hypothetical protein 


SAG1968


246


conserved hypothetical protein l i(jrKUUU4o


SAG1969


31 / 


ribosomal protein L11 methyltransferase






conserved hypothetical protein


SAG1971


41 


hypothetical protein 


SAG1972


23 o 


transcriptional regulator, MerR family


SAG1973


1 DO 


acetyltransferase, GNAT family


SAG1974


1DZ 


MutT/nudix family protein


SAG1975


4/ 


hypothetical protein


SAG1976


156 


conserved hypothetical protein


SAG1977 


163 


acetyltransferase, GNAT family 


SAG1978 


422 


ATPase, AAA family 


SAG1979 


253 


membrane protein, putative 


SAG1980 


300 


ABC transporter, ATP-binding protein 


SAG1981 


68 


hypothetical protein 


SAG1982 


359 


transcriptional regulator, Cro/CI family


SAG1983


105


conserved hypothetical protein 


SAG1984 


188


conserved hypothetical protein TIGR00730


SAG1985 


51 


hypothetical protein 


SAG1986 


375 


site-specific recombinase, phage integrase family 


SAG1987 


61 


conserved hypothetical protein 


SAG1988 


342 


conserved hypothetical protein 


SAG1989 


139 


hypothetical protein 


SAG1990 


127 


hypothetical protein 


SAG1991 


204 


transcriptional regulator, Cro/CI family 


SAG1992 


518 


protein of unknown function 


SAG1993 


373 


site-specific recombinase, phage integrase family 


SAG1994 


108 


conserved hypothetical protein 


SAG1995 


210 


hypothetical protein 


SAG1996 


263 


cell wall surface anchor family protein, putative 


SAG1997 


182 


hypothetical protein 


SAG1998 


457 


hypothetical protein 


SAG1999 


47 


hypothetical protein 


SAG2000 


666 


membrane protein, putative 


SAG2001 


756 


conjugal transfer protein, interruption-C 


SAG2002 


129 


IS1381, transposase OrfB


SAG2003 


127 


IS1381, transposase OrfA


SAG2004 


67 1 


conjugal transfer protein, interruption-N 


SAG2005 


136 


conserved hypothetical protein 


SAG2006 


88 


conserved hypothetical protein 


SAG2007 


317 


conserved hypothetical protein 


SAG2008 


84 


conserved hypothetical protein 


SAG2009 


88 


conserved hypothetical protein


SAG2010 


157 


hypothetical protein 


SAG2011 


160 


conserved hypothetical protein


SAG2012 


90 


hypothetical protein 


SAG2013 


189 


hypothetical protein 


SAG2014 


449 


hypothetical protein 


SAG2015 


99 


transcriptional regulator, Cro/CI family 


SAG2016 


125 


hypothetical protein


SAG2017 


429 


transcriptional regulator, Cro/CI family


SAG2018 


553 


FtsK/SpoIIIE family protein


SAG2019 


153 


hypothetical protein


SAG2020 


98 


hypothetical protein


SAG2021 


826 


cell wall surface anchor family protein


SAG2022 


417 


transposase, ISL3 family 


SAG2023 


546 


mercuric reductase




Uv 


mercuric resistance operon regulatory protein MerR


SAG2025 


522 


Mn2+/Fe2+ transporter, NRAMP family 


SAG2026 


240 


membrane protein, putative 


SAG2027 


205 


ABC transporter, ATP-binding protein 


SAG2028 


36 


conserved hypothetical protein 


SAG2029 


284 


streptomycin resistance protein 


SAG2030


130 


hypothetical protein


SAG2031


202 


hypothetical protein


SAG2032 


111 


conserved hypothetical protein 


SAG2033 


162 


acetyltransferase, GNAT family


SAG2034 


247 


membrane protein, putative 


SAG2035 


300 


ABC transporter, ATP-binding protein 


SAG2036 


68 


hypothetical protein 


SAG2037 


358 


transcriptional regulator, Cro/CI family


SAG2038 


204 


PAP2 family protein


SAG2039 


98 


conserved hypothetical protein 


SAG2040 


186 


conserved hypothetical protein TIGR00730 


SAG2041 


287 


protease, putative 


SAG2042 


100 


rhodanese family protein 


SAG2043 


255 


cAMP factor 


SAG2044 


62 


hypothetical protein 


SAG2045 


179 


DNA topology modulation protein FlaR, putative


SAG2046 


361 


glycerol dehydrogenase, putative


SAG2047 


235 


conserved hypothetical protein 


SAG2048 


614 


5-methyltetrahydrofolate--homocysteine methyltransferase, putative


SAG2049 


745 


5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase


SAG2050 


107 


conserved hypothetical protein 


SAG2051 


230 


branched-chain amino acid transport protein AzlC, putative 


SAG2052 


41 


hypothetical protein 


SAG2053 


1570 


serine protease, subtilase family, putative


SAG2054 


228 


DNA-binding response regulator


SAG2055 


462 


sensor histidine kinase 


SAG2056 


202 


chromosome assembly-related protein 


SAG2057 


833 


leucyl-tRNA synthetase 


SAG2058 


415 


major facilitator family protein 


SAG2059 


281 


protein of unknown function


SAG2060 


398 


glycosyl transferase, family 8


SAG2061 


401 


glycosyl transferase, family 8 


SAG2062 


179 


transcription antitermination protein NusG 


SAG2063 


630 


pathogenicity protein, putative 


SAG2064 


57 


preprotein translocase, SecE subunit, putative 


SAG2065 


50 


ribosomal protein L33 


SAG2066 


773 


penicillin-binding protein 2A 


SAG2067 


294 


ribosomal large subunit pseudouridine synthase, RluD subfamily


SAG2068 


546 


conserved hypothetical protein 


SAG2069 


403 


phosphopentomutase 




991 


deoxyribose-phosphate aldolase


SAG2071 


400 


Na+ dependent nucleoside transporter 


SAG2072 


259 


uridine phosphorylase 


SAG2073 


245 


transcriptional regulator, GntR family 


SAG2074 


540 


60 kDa chaperonin


SAG2075 


94 


chaperonin, 10 kDa


SAG2076


267 


ABC transporter, ATP-binding protein


SAG2077


298 


ABC transporter, permease protein 


SAG2078 


320 


protein of unknown function/lipoprotein, putative 


SAG2079 


265 


hydrolase, haloacid dehalogenase-like family 


SAG2080 


286 


glyoxalase family protein 


SAG2081 


243 


conserved hypothetical protein


SAG2082 


205 


anaerobic ribonucleoside-triphosphate reductase activating protein 


SAG2083 


163 


acetyltransferase, GNAT family 


SAG2084 


310 


virulence factor MviM, putative 


SAG2085 


47 


conserved hypothetical protein 


SAG2086 


723 


anaerobic ribonucleoside-triphosphate reductase 


SAG2087 


495 


membrane protein, putative 


SAG2088 


40 


hypothetical protein 


SAG2089 


105 


conserved hypothetical protein 


SAG2090 


136 


conserved hypothetical protein TIGR00250 


SAG2091 


88 


conserved hypothetical protein 


SAG2092 


132 


conserved hypothetical protein 


SAG2093 


379 


recA protein 


SAG2094 


NA 


competence/damage-inducible protein CinA, authentic frameshift


SAG2095 


183 


DNA-3-methyladenine glycosylase I 


SAG2096 


196 


Holliday junction DNA helicase RuvA 


SAG2097 


418 


transporter, putative 


SAG2098 


659 


DNA mismatch repair protein HexB 


SAG2099 


33 


hypothetical protein 


SAG2100 


67 


cold shock protein, CSD family 


SAG2101 


858 


DNA mismatch repair protein HexA 


SAG2102 


145 


arginine repressor ArgR, putative 


SAG2103 


563 


arginyl-tRNA synthetase 


SAG2104 


102 


conserved hypothetical protein 


SAG2105 


290 


conserved hypothetical protein 


SAG2106 


314 


conserved hypothetical protein 


SAG2107 


583 


aspartyl-tRNA synthetase 


SAG2108 


426 


histidyl-tRNA synthetase 


SAG2109 


60 


ribosomal protein L32 


SAG2110 


49 


ribosomal protein L33 


SAG2111 


173 


conserved hypothetical protein 


SAG2112 


494 


site-specific recombinase, phage integrase family 


SAG2113 


82 


conserved hypothetical protein 


SAG2114 


342 


conserved hypothetical protein 


SAG2115 


143 


hypothetical protein 


SAG2116 


151 


conserved hypothetical protein 


SAG2117 


71 


hypothetical protein 


SAG2118 


306 


transcriptional regulator, Cro/CI family 


SAG2119 


373 


conserved domain protein 


SAG2120 


269 


hypothetical protein 


SAG2121 


223 


hypothetical protein 


SAG2122 


223 


DNA-binding response regulator 


SAG2123 


454 


sensor histidine kinase 


SAG2124 


517 


membrane protein, putative 
SAG2125


308 


carbamate kinase 


SAG2126 


332 


ornithine carbamoyltransferase


SAG2127 


431 


sensor histidine kinase


SAG2128 


277 


response regulator 


SAG2129 


240 


amino acid ABC transporter, ATP-binding protein 


SAG2130 


504 


amino acid ABC transporter, amino acid-binding protein/permease protein


SAG2131 


847 


membrane protein, putative 


SAG2132 


247 


conserved hypothetical protein 


SAG2133 


118 


conserved hypothetical protein 


SAG2134 


772 


membrane protein, putative 


SAG2135 


179 


transcriptional regulator, TetR family, putative


SAG2136 


98 


conserved hypothetical protein 


SAG2137 


203 


ribosomal protein S4 


SAG2138 


95 


conserved hypothetical protein 


SAG2139 


451 


replicative DNA helicase 


SAG2140 


150 


ribosomal protein L9 


SAG2141 


660 


DHH family protein 


SAG2142 


613 


glucose inhibited division protein A 


SAG2143 


203 


membrane protein, putative 


SAG2144 


373 


tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase


SAG2145 


222 


L-serine dehydratase, iron-sulfur-dependent, beta subunit


SAG2146 


290 


L-serine dehydratase, iron-sulfur-dependent, alpha subunit 


SAG2147 


234 


protein of unknown function/lipoprotein, putative 


SAG2148 


179 


LysM domain protein 


SAG2149 


264 


cobalt transport family protein 


SAG2150 


280 


ABC transporter, ATP-binding protein 


SAG2151 


279 


ABC transporter, ATP-binding protein 



SAG2152 


180 


CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase


SAG2153 


427 


peptidase, M16 family


SAG2154 



414 


conserved hypothetical protein 


SAG2155 


117 


conserved hypothetical protein 



SAG2156 



369 



recF protein 


SAG2157 


278 


transporter, putative 



SAG2158



220 


transcriptional regulator, Cro/CI family 




4yj 


inosine-5'-monophosphate dehydrogenase



SAG2160


lol 


transcriptional regulator, ArgR family 




OOiC 


transcriptional regulator, Crp/Fnr family 




ZJ4 


conserved hypothetical protein 


SAG2163


410


arginine deiminase 






acetyltransferase, GNAT family


SAG2165 


337 


ornithine carbamoyltransferase 


SAG2166 


475 


arginine/ornithine antiporter


SAG2167 


318


carbamate kinase 


SAG2168 


341 


tryptophanyl-tRNA synthetase 


SAG2169 


230 


membrane protein, putative 


SAG2170 


290 


conserved hypothetical protein 
SAG2171


539 


ABC transporter, ATP-binding protein 


SAG2172 


859 


ABC transporter, permease protein, putative 


SAG2173 


159 


conserved hypothetical protein TIGR00246 


SAG2174 


409 


serine protease 


SAG2175 


257 


partitioning protein, ParB family 
Table 2

ORF | Size (aa) | Signal peptide | Sortase motif | Lipoprotein | Other | Western blot | FACS | GBS specific | Annotation


SAG0017 


447 


+ ; 












PcsB


SAG0031 


299 


+ 












peptidase, M23/M37 family


SAG0032 


434 


+ 








+ 


+ 


group B streptococcal surface immunogenic protein


SAG0034 


438 


+ 




+ 




+ 




sugar ABC transporter, sugar-binding protein


SAG0051 


126 










+ 


+ 


MORN motif family protein


SAG0079 


212 








+ 


+ 


+ 


adenylate kinase


SAG0086 


85 






+ 








+


lipoprotein, putative


SAG0093 


250 


+ 








+ 


+ 


D-alanyl-D-alanine carboxypeptidase family protein


SAG0094 


191 


+ 












N-acetylmuramoyl-L-alanine amidase, family 4 protein


SAG0108 


308 


+ 














conserved hypothetical protein


SAG0114 


322 


+ 




+ 










ribose ABC transporter, periplasmic D-ribose-binding protein


SAG0124 


356 


+ 














sensor histidine kinase 


SAG0132 


294 


+ 










+ 




SPFH domain/Band 7 family protein 


SAG0134 


96 


+ 












+ 


hypothetical protein 


SAG0146 


395 


+ 














penicillin-binding protein 4, putative 


SAG0147


411 


+ 














D-alanyl-D-alanine carboxypeptidase family protein


SAG0148 


551 






+ 




+ 


* 




oligopeptide ABC transporter, substrate-binding protein, 
putative 


SAG0166 


123 


+ 














conserved domain protein 


SAG0176 


94 
















conserved hypothetical protein


SAG0187 


542 


+




+ 




+ 


+ 




oligopeptide ABC transporter, oligopeptide-binding 
protein 


SAG0206 


6< 


) 




+ 








+ 


lipoprotein, putative


SAG0213


3! 


} + 












+ 


hypothetical protein 


SAG0231 


13 


s + 














hypothetical protein 


SAG0242 


30 


8 




+ 










amino acid ABC transporter, amino acid-binding protein




1 j 














+ 


protein of unknown function/lipoprotein, putative


SAG0255 


31 


5 + 














conserved hypothetical protein


SAG025*3 


r 5 


3 




+ 








+ 


lipoprotein, putative 


SAG026f 


; 23 


5 + 








+ 






conserved hypothetical protein


SAG0290


) 27 


0 + 








+ 






ABC transporter, substrate-binding protein 


SAG029J 


J 75 


0 + 














penicillin-binding protein 1A





SAG0306 


535 


+












KH domain protein


SAG0321 


339 


+












Si 
conserved hypothetical protein


SAG0604 


23 
prophage LambdaSa1, lysin, putative


SAG0611 


' 43 
sensor histidine kinase VncS


SAG0624 


574


+














septation ring formation regulator EzrA, putative
> 35


4 + 














conserved domain protein 


SAG0635
5 +








+ 






acid phosphatase, class B 
SAG0645 554



cell wall surface anchor family protein



SAG0646 307 + 



cell wall surface anchor family protein



SAG0647 305 +



sortase family protein



SAG0649 890 



cell wall surface anchor family protein, putative



SAG0658 383 + 



lipoprotein, putative 



SAG0675 171 + 



putative secreted protein 



SAG0676 885 



proteinase, putative 



SAG0677 1062 



hypothetical protein 



SAG0679 3431 



protein of unknown function



SAG0680 339 



protein of unknown function 



SAG0681 353 + 



conserved domain protein



SAG0686 261 + 



SAG0714 188 + 



DNA-entry nuclease, putative



conserved hypothetical protein

amino acid ABC transporter, amino acid-binding protein



SAG0717 266 



SAG0720 449 



sensory box histidine kinase



SAG0738 132 



conserved hypothetical protein



SAG0739 143 + 



conserved hypothetical protein



SAG0742 428 



peptidase, U32 family



SAG0755 282 



peptidase, U32 family 



SAG0757 129 



protein of unknown function/lipoprotein, putative 



SAG0764 230 



SAG0765 681 + 



SAG0771 512 + 



SAG0776 276 + 



SAG0777 528 



SAG0785 330 + 



SAG0808 309 + 



+ + 



phosphoglycerate mutase family protein



penicillin-binding protein 2b



~ + cell wall surface anchor family protein 



YaeC family protein, putative



+ I + 
protease maturation protein, putative



SAG0824 417 + 



polysaccharide deacetylase family protein



SAG0832 753 + 



protein of unknown function 



SAG0833 181 + 



hypothetical protein



SAG0867 63 + 



conserved hypothetical protein



SAG0868 285 + 



DNA-entry nuclease



SAG0886 319 + 



protein of unknown function 
R5 protein
SAG1350
surface antigen-related protein
conserved hypothetical protein


SAG1371


392 


+
SAG1393
iron compound ABC transporter, substrate-binding protein


SAG1404 
+
cell wall surface anchor family protein


SAG1405
sortase family protein
lipoprotein, putative
amino acid ABC transporter, amino acid-binding protein
SAG1462 970



SAG1473 192



cell wall surface anchor family protein 



cell wall surface anchor family protein 



SAG1474 680 +



amidase family protein 



SAG1483 78 +



SAG1488 195 +



SAG1491 530



SAG1508 590



SAG1518 538 + 



preprotein translocase, SecG subunit



dephospho-CoA kinase 



+ + 



hypothetical protein 



67 kDa myosin-crossreactive streptococcal antigen



peptide ABC transporter, peptide-binding protein



SAG1530 267 +



SAG1533 308 + 



SAG1544 232 +



SAG1551 67 +



SAG1552 719 +



peptidyl-prolyl cis-trans isomerase, cyclophilin-type



manganese ABC transporter, manganese-binding adhesion lipoprotein



gluconate 5-dehydrogenase, putative



hypothetical protein 



conserved hypothetical protein 



SAG1553 477 + 



SAG1562 280 +



SAG1582 388 + 



SAG1590 449 



hypothetical protein 



conserved hypothetical protein



branched-chain amino acid ABC transporter, amino acid-binding protein



potassium uptake protein, Trk family 



SAG1601



SAG1610 285 



conserved hypothetical protein 



amino acid ABC transporter, substrate-binding protein 



SAG1618 1032



SAG1624 501 + 



Snf2 family protein 



sensor histidine kinase CsrS 



SAG1628 18 



SAG1631 223 + 



SAG1641 274 



SAG1642 277 +



SAG1683 512



SAG1706 238



lemA protein 



potassium uptake protein, Trk family, putative 



YaeC family protein 



ABC transporter, substrate-binding protein 



immunogenic secreted protein, putative 



conserved hypothetical protein 



SAG1745 148 + 



SAG1752 390 + 



SAG1759



SAG1762 



230 



169 



hypothetical protein 



conserved hypothetical protein TIGR00275



protein of unknown function 



conserved hypothetical protein 
SAG1786 130



SAG1822 272 + 



SAG1823 418 



SAG1837 468 
SAG1842 1224
acid phosphatase



glyceraldehyde 3-phosphate dehydrogenase 



conserved hypothetical protein 



protein of unknown function 



dltD protein 



sensor histidine kinase



protein of unknown function 



protein of unknown function 



prophage LambdaSa2, lysin, putative 



prophage LambdaSa2, holin, putative 



conserved hypothetical protein 



prophage LambdaSa2, PblB, putative



N-acetylmuramoyl-L-alanine amidase, family 4 protein



sensor histidine kinase 



neuraminidase-related protein



adhesion lipoprotein 



SAG1941 800 +



2',3'-cyclic-nucleotide 2'-phosphodiesterase



SAG1945 345 + 



iron ABC transporter, iron-binding protein 



SAG1947 549 



SAG1960 551 



SAG1966 293



SAG1996 263 +



conserved hypothetical protein 



+ 



sensor histidine kinase



hemolysin precursor, putative



cell wall surface anchor family protein, putative 



SAG1997 182 + 



SAG1998 457 



SAG2021 826



SAG2043 255 +



hypothetical protein 



hypothetical protein 



cell wall surface anchor family protein 



CAMP factor



SAG2053 1570 +



serine protease, subtilase family, putative 



SAG2055 46 



sensor histidine kinase 



SAG2056 20 



chromosome assembly-related protein



SAG2063 630 + 



pathogenicity protein, putative 



SAG2078 320 + 



SAG2094 



protein of unknown function/lipoprotein, putative



competence/damage-inducible protein CinA, authentic frameshift
sensor histidine kinase


SAG2141


660 


+ 








+ 






DHH family protein 


SAG2147 
+




+ 




+ 


+ 




protein of unknown function/lipoprotein, putative


SAG2148 


179 
















LysM domain protein 


SAG2174 


409 


+ 














serine protease 


SAG0013 


428 


+ 








+ 




protein of unknown function
SAG0038


conserved hypothetical protein 


SAG0048 


transcriptional regulator Cro/CI family 


SAG0091 


transcriptional regulator ComX1 putative


SAG0137 


conserved hypothetical protein 


SAG0686 


DNA-entry nuclease putative 


SAG0770 


membrane protein putative 


SAG0868 


DNA-entry nuclease 


SAG1143


conserved hypothetical protein 


SAG1233 


streptococcal histidine triad family protein 


SAG1596 


integrase/recombinase phage integrase family 


SAG1616 


conserved hypothetical protein 


SAG1721 


conserved hypothetical protein. 
Table 6



Cluster 1 

SAG0230 

SAG0231 

SAG0232 

SAG0233 

SAG0234 

SAG0235 



conserved hypothetical protein 
hypothetical protein 
hypothetical protein 
hypothetical protein 
hypothetical protein 
hypothetical protein 



Cluster 2 

SAG0222 conserved domain protein

SAG0223 conserved hypothetical protein, fusion

SAG0225 hypothetical protein

SAG0226 recombination protein 

SAG0227 hypothetical protein 

SAG0228 conserved hypothetical protein

SAG0229 conserved hypothetical protein



Cluster 3 

SAG0634 

SAG0635 

SAG0636 

SAG0638 

SAG0640



hypothetical protein 

acid phosphatase, class B 

conserved hypothetical protein 

cell wall surface anchor family protein, interruption-N 

transposase OrfA, IS3 family 






SAG0642 
SAG0643 
SAG0644 
SAG0645 
SAG0646 
SAG0647 
SAG0648 
SAG0649
SAG0650 
SAG0651 

Cluster 4 

SAG1898 
SAG1899 
SAG1900 
SAG1901 
SAG1902 
SAG1905
SAG1906 




hypothetical protein 
chaperonin, 33 kDa, degenerate 
transcriptional regulator, AraC family 
cell wall surface anchor family protein 
cell wall surface anchor family protein 
sortase family protein 
sortase family protein 

cell wall surface anchor family protein, putative 
sortase family protein 
protein of unknown function 



PTS system, IID component
PTS system, IIC component
PTS system, IIB component
glucuronyl hydrolase
PTS system, IIA component
conserved hypothetical protein 
carbohydrate kinase, PfkB family 



Cluster 5 

SAG0247 
SAG0248 



hypothetical protein 
hypothetical protein 






SAG0249 

SAG0674 

SAG0675 

SAG0676 

SAG0677 

SAG0680 

SAG0681 

SAG0684 

SAG1698 

Cluster 6 

SAG0261 
SAG0262 
SAG0965 
SAG0966 
SAG2002 

Cluster 7 

SAG1027 
SAG1028 
SAG1029
SAG1030 
SAG1031 




hypothetical protein 

hypothetical protein 

putative secreted protein 

proteinase, putative 

hypothetical protein 

protein of unknown function 

conserved domain protein 

ABC transporter, ATP-binding protein 

conserved hypothetical protein 



IS1381, transposase OrfB
IS1381, transposase OrfA
IS1381, transposase OrfA
IS1381, transposase OrfB 
IS1381, transposase OrfB 



conserved hypothetical protein 
hypothetical protein 
hypothetical protein 
protein of unknown function 
conserved domain protein 
SAG1032 conserved hypothetical protein



Cluster 8 

SAG1253
SAG1254
SAG1255 
SAG2022 
SAG2023 
SAG2024 



transposase, ISL3 family 
mercuric reductase 

mercuric resistance operon regulatory protein MerR 
transposase, ISL3 family 
mercuric reductase 

mercuric resistance operon regulatory protein MerR



Cluster 9 

SAG1993 site-specific recombinase, phage integrase family

SAG1994 conserved hypothetical protein

SAG1995 hypothetical protein

SAG1996 cell wall surface anchor family protein, putative

SAG1997 hypothetical protein

SAG1998 hypothetical protein 

SAG2000 membrane protein, putative 

SAG2001 conjugal transfer protein, interruption-C 

SAG2007 conserved hypothetical protein

SAG2008 conserved hypothetical protein

SAG2009 conserved hypothetical protein

SAG2010 hypothetical protein 






SAG2011 

SAG2012 

SAG2016 

SAG2017 

SAG2025 

Cluster 10 

SAG1039 

SAG1447 

SAG1448 

SAG1449 

SAG1450 

SAG1452 

SAG1453 

SAG1454 

SAG1455 

SAG1456 

SAG1459 

SAG1460 

SAG1461 

SAG1462 

SAG1463 

SAG1469 





conserved hypothetical protein 

hypothetical protein 

hypothetical protein 

transcriptional regulator, Cro/CI family 

Mn2+/Fe2+ transporter, NRAMP family 



conserved hypothetical protein 
conserved hypothetical protein 
glycosyl transferase, group 1 family protein 
preprotein translocase SecA subunit, putative 
conserved domain protein 
conserved hypothetical protein 
preprotein translocase SecY family protein 
glycosyl transferase, putative 

glycosyl transferase, group 2 family protein 

glycosyl transferase, family 8, degenerate 

glycosyl transferase family 8 

glycosyl transferase, family 8 

conserved hypothetical protein 

cell wall surface anchor family protein 

transcriptional regulator, RofA family, authentic point mutation 
conserved hypothetical protein 






SAG1471 
SAG1933 

Cluster 11 

SAG0009 

SAG0120 

SAG0157 

SAG0186 

SAG0216 

SAG0236 

SAG0307 

SAG0308 

SAG0311 

SAG0518 

SAG0553 

SAG0555 

SAG0564 

SAG0579 

SAG0580 

SAG0611

SAG0637 

SAG0641 

SAG0652 




conserved hypothetical protein 

PTS system, IIC component, putative



hypothetical protein 
hypothetical protein 

deoxyribonuclease-related protein, degenerate 

hypothetical protein 

hypothetical protein 

hypothetical protein 

hypothetical protein 

ABC transporter, ATP-binding protein 

DNA-binding response regulator, authentic point mutation 

peptide chain release factor 2, programmed frameshift 

hypothetical protein 

prophage LambdaSa1, antirepressor, putative
conserved hypothetical protein 
conserved hypothetical protein 
conserved hypothetical protein, truncation 
transposase, degenerate 

transcriptional regulator, TetR family, putative, authentic frameshift 
Tn5252, Orf 10 protein, degenerate 
Tn5252, Orf 28 protein, degenerate 



SAG0655 


conserved hypothetical protein 


SAG0678 


endopeptidase O, degenerate 


SAG0683 


transmembrane protein Vexp3, putative, degenerate 


SAG0855 


glycogen biosynthesis protein GlgD, authentic frameshift 


SAG0898 


hypothetical protein 


SAG0899 


hypothetical protein 


SAG0901 


hypothetical protein 



SAG0902 


hypothetical protein 


SAG0903 


hypothetical protein 


SAG0917 


Tn916, hypothetical protein 


SAG0920 


Tn916, hypothetical protein 


SAG0922 


Tn916, hypothetical protein


SAG0924 


Tn916, tetM leader peptide 


SAG0928 


Tn916, hypothetical protein, authentic frameshift 


SAG0936 


Tn916, hypothetical protein 


SAG0943 


hypothetical protein 


SAG0972 


conserved hypothetical protein, authentic frameshift 




hypothetical protein


SAG1080 


hypothetical protein 


SAG1123 


hypothetical protein 


SAG1129 


hypothetical protein 


SAG1136 


conserved hypothetical protein 


SAG1217 


conserved hypothetical protein, authentic frameshift 



SAG1231


transposase OrfB, IS3 family, degenerate 


SAG1242


transposase OrfB, IS3 family, truncation 


SAG1309 


hypothetical protein 


SAG1331


R5 protein 


SAG1437


hypothetical protein 


SAG1445


MutT/nudix family protein, authentic frameshift 


SAG1484


ribosomal protein L33 


SAG1493


hypothetical protein 


SAG1539


hypothetical protein 


SAG1543


conserved hypothetical protein, authentic frameshift 


SAG1560 


hypothetical protein 


SAG1568 


phosphoserine aminotransferase, authentic frameshift


SAG1570 


conserved hypothetical protein 


SAG1601


conserved hypothetical protein


SAG1644


hypothetical protein 


SAG1646 


hypothetical protein 


SAG1699 


hypothetical protein 


SAG1705 


peptidase, M24 family, authentic point mutation 


SAG1708 


hypothetical protein 



SAG1857 prophage LambdaSa2, HNH endonuclease family protein 
SAG1864 hypothetical protein 
SAG1868 hypothetical protein



SAG1869 prophage LambdaSa2, type II DNA modification methyltransferase, putative

SAG1872 hypothetical protein 

SAG1874 hypothetical protein

SAG1876 prophage LambdaSa2, HNH endonuclease family protein

SAG1878 conserved domain protein

SAG1881 hypothetical protein

SAG1883 conserved hypothetical protein

SAG1886 hypothetical protein

SAG1903 hypothetical protein 

SAG1937 streptococcal histidine triad family protein, degenerate 

SAG1971 hypothetical protein

SAG1979 membrane protein, putative

SAG1980 ABC transporter, ATP-binding protein

SAG1981 hypothetical protein

SAG1982 transcriptional regulator, Cro/CI family

SAG1983 conserved hypothetical protein

SAG1984 conserved hypothetical protein TIGR00730

SAG1985 hypothetical protein 

SAG1991 transcriptional regulator, Cro/CI family 

SAG1992 protein of unknown function

SAG1999 hypothetical protein

SAG2004 conjugal transfer protein, interruption-N 

SAG2039 conserved hypothetical protein 
SAG2044 hypothetical protein
SAG2052 hypothetical protein 
SAG2065 ribosomal protein L33 

SAG2094 competence/damage-inducible protein CinA, authentic frameshift
SAG2099 hypothetical protein 



Cluster 12 

SAG1164 

SAG1165 

SAG1166 

SAG1167 

SAG1168 



glycosyl transferase CpsJ(V) 
glycosyl transferase CpsO(V) 
glycosyl transferase CpsN(V)
polysaccharide biosynthesis protein CpsM(V) 
polysaccharide biosynthesis protein cpsH(V) 



Cluster 13 

SAG0581 conserved hypothetical protein

SAG0582 conserved hypothetical protein 

SAG0583 conserved hypothetical protein 

SAG0585 conserved hypothetical protein 

SAG0586 conserved hypothetical protein 

SAG0587 prophage LambdaSa1, structural protein, putative

SAG0588 conserved hypothetical protein 

SAG0589 conserved hypothetical protein 



SAG0590 conserved hypothetical protein 

SAG0591 conserved hypothetical protein 

SAG0593 prophage LambdaSa1, structural protein

SAG0594 conserved hypothetical protein 

SAG0595 conserved hypothetical protein 

SAG0596 prophage LambdaSa1, pblA protein, internal deletion



Cluster 14 

SAG0915 Tn916, transposase 

SAG0918 Tn916, hypothetical protein

SAG0919 Tn916, hypothetical protein

SAG0921 Tn916, transcriptional regulator, putative

SAG0925 Tn916, hypothetical protein 

SAG0926 Tn916, NLP/P60 family protein 

SAG0927 membrane protein, putative

SAG0929 Tn916, hypothetical protein 

SAG0930 Tn916, hypothetical protein 

SAG0931 Tn916, hypothetical protein

SAG0932 Tn916, transcriptional regulator, putative 

SAG0933 Tn916, FtsK/SpoIIIE family protein 
SAG0934 Tn916, hypothetical protein 
SAG0935 Tn916, hypothetical protein 

SAG0937 ABC transporter, ATP-binding protein, authentic frameshift 



Cluster 15 

SAG1835 conserved hypothetical protein

SAG1837 prophage LambdaSa2, lysin, putative 

SAG1839 conserved hypothetical protein 

SAG1840 hypothetical protein 

SAG1842 prophage LambdaSa2, PblB, putative 

SAG1843 conserved hypothetical protein 

SAG1844 conserved hypothetical protein 

SAG1849 hypothetical protein

SAG1851 conserved domain protein

SAG1852 conserved domain protein

SAG1853 prophage LambdaSa2, protease, putative

SAG1854 conserved hypothetical protein 

SAG1855 prophage LambdaSa2, terminase large subunit, putative 
SAG1856 hypothetical protein 
SAG1858 hypothetical protein 

SAG1859 prophage LambdaSa2, site-specific recombinase, phage integrase family 
SAG1860 conserved hypothetical protein

SAG1861 prophage LambdaSa2, transcriptional regulator, Cro/CI family
SAG1862 hypothetical protein 

SAG1863 prophage LambdaSa2, single-strand binding protein 
SAG1865 conserved hypothetical protein

SAG1866 conserved hypothetical protein 

SAG1867 conserved hypothetical protein

SAG1870 prophage LambdaSa2, DNA replication protein DnaC, putative

SAG1871 prophage LambdaSa2, bacteriophage replication protein/hypothetical 
protein, truncation/fusion 

SAG1873 prophage LambdaSa2, replicative DNA helicase

SAG1877 prophage LambdaSa2, antirepressor protein, putative 

SAG1879 hypothetical protein 

SAG1882 prophage LambdaSa2, repressor protein, putative

SAG1884 hypothetical protein

SAG1885 prophage LambdaSa2, site-specific recombinase, phage integrase family
Cluster 16 

SAG1247 site-specific recombinase, phage integrase family

SAG1250 Tn5252, relaxase 

SAG1251 Tn5252, Orf 9 protein

SAG1252 Tn5252, Orf 10 protein

SAG1256 IS861, transposase OrfB, truncation

SAG1257 cation-transporting ATPase, E1-E2 family 

SAG1258 cadmium efflux system accessory protein 

SAG1259 conserved hypothetical protein 

SAG1260 hypothetical protein

SAG1261 conserved hypothetical protein 

SAG1262 cation-transporting ATPase, E1-E2 family 

SAG1263 conserved domain protein, authentic frameshift

SAG1264 transcriptional repressor CopY, putative

SAG1265 cadmium resistance transporter, putative

SAG1266 hypothetical protein

SAG1267 hypothetical protein

SAG1268 repressor protein, putative

SAG1270 ImpB/MucB/SamB family protein

SAG1271 conserved hypothetical protein

SAG1272 conserved hypothetical protein

SAG1273 conserved hypothetical protein 

SAG1274 conserved hypothetical protein 

SAG1276 conserved hypothetical protein

SAG1277 hypothetical protein

SAG1278 hypothetical protein 

SAG1279 conserved domain protein

SAG1280 SNF2 family protein 

SAG1281 hypothetical protein 

SAG1283 agglutinin receptor

SAG1284 abortive infection protein AbiGI

SAG1285 abortive infection protein AbiGII

SAG1286 Tn5252, Orf28 

SAG1287 Tn5252, Orf26

SAG1288 Tn5252, Orf25, degenerate

SAG1289 Tn5252, Orf23 

SAG1290 hypothetical protein

SAG1291 Tn5252, Orf 21 protein, internal deletion 

SAG1292 hypothetical protein 

SAG1293 protease, putative

SAG1294 conserved hypothetical protein

SAG1295 conserved hypothetical protein 

SAG1296 conserved hypothetical protein 

SAG1297 C-5 cytosine-specific DNA methylase

SAG1299 conserved hypothetical protein

SAG1304 hypothetical protein 
Table 7



Locus 

Housekeeping 
SAG0466 
SAG0471 
SAG0492 
SAG0767 
SAG1086
SAG1600
SAG1680
SAG1723



Annotation 

thiolase 
glucokinase 

amino acid ABC transporter, ATP-binding protein 
D-alanine--D-alanine ligase
xanthine phosphoribosyltransferase
glutamate racemase 
shikimate 5-dehydrogenase 
signal peptidase I 



Surface-exposed 




SAG0079 


adenylate kinase 


SAG0093 


D-alanyl-D-alanine carboxypeptidase family protein 


SAG0163 


competence protein CglA 


SAG0290 


ABC transporter, substrate-binding protein 


SAG0368 


protein of unknown function 


SAG0503 


lipase/acylhydrolase 


SAG1473 


cell wall surface anchor family protein 


SAG1552 


conserved hypothetical protein 


SAG1641 


YaeC family protein 


SAG2147 


protein of unknown function/lipoprotein, putative 


SAG2148 


LysM domain protein 



Table 8: GBS genes shared with GAS and pneumococcus



ORFxxxxx Annotation



ORF00003 PcsB protein (pcsB)



ORF00004 ribose-phosphate pyrophosphokinase (prsA)



ORF00005 aminotransferase, class 1 



ORF00006 recombination protein O



ORF00009 fatty acid/phospholipid synthesis protein PlsX (plsX)



ORF00011 phosphoribosylaminoimidazole-succinocarboxamide synthase (purC)



ORF00012 phosphoribosylformylglycinamidine synthase, putative 



ORF00013 amidophosphoribosyltransferase (purF) 



ORF00014 phosphoribosylformylglycinamidine cyclo-ligase (purM)



ORF00015 phosphoribosylglycinamide formyltransferase (purN)



ORF00020 group B streptococcal surface immunogenic protein 



ORF00021 N-acetylmannosamine-6-P epimerase, putative 



ORF00022 sugar ABC transporter, sugar-binding protein 



ORF00023 sugar ABC transporter, permease protein 



ORF00024 sugar ABC transporter, permease protein



ORF00026 conserved hypothetical protein



ORF00027 N-acetylneuraminate lyase, putative 



ORF00028 expressed ROK family protein 



ORF00030 phosphosugar-binding transcriptional regulator, RpiR family, putative



ORF00031 phosphoribosylamine-glycine ligase (purD)

ORF00032 phosphoribosylaminoimidazole carboxylase, catalytic subunit (purE)
ORF00033 phosphoribosylaminoimidazole carboxylase, ATPase subunit (purK)

ORF00036 adenylosuccinate lyase (purB)

ORF00037 transcriptional regulator, Cro/CI family 

ORF00038 Holliday junction DNA helicase RuvB (ruvB)



ORF00039 phosphotyrosine protein phosphatase, low molecular weight 



ORF00040 MORN motif family protein



ORF00041 membrane protein, putative 



ORF00043 alcohol dehydrogenase, propanol-preferring (adhP) 



ORF00045 MATE efflux family protein 



ORF00046 ribosomal protein S10 (rpsJ) 



IORF00047 ribosomal protein L3 (rplC) 



loRF00048 ribosomal protein L4 (rplD) 
ORF00049 ribosomal protein L23 (rplW) 



1ORF00050 ribosomal protein L2 (rplB) 
ORF00052 ribosomal protein S19 (rpsS) 



loRF00054 ribosomal protein L22 (rpIV) 



IORFQQ055 ribosomal protein S3 (rpsC) 



ORF00056 ribosomal protein L16 (rplP) 



pRF00058 ribosomal protein L29 (rpmC) 



I ORF00059 ribosomal protein S17 (rpsQ) 
pRF00060 ribosomal protein L14 (rpIN) 
|ORF00Q61 ribosomal protein L24 (rplX) 



loRF00063 ribosomal protein L5 (rplE) 



IORF00065 ribosomal protein S8 (rpsH) 



IORF00066 ribosomal protein L6 (rplF) 



ORF00068 ribosomal protein L18 (rpIR) 



|ORF00069 ribosomal protein S5 (rpsE) 



ORF0Q070 ribosomal protein L30 (rpmD) 



IORFQ0071 ribosomal protein L15 (rplO) 



[ORF0Q072 preprotein translocase. SecY subunit 
IORF0007 3 adenylate kinase (adk) 



ORF00074 translation initiation factor IF-1 (infA)
ORF00075 ribosomal protein L36 (rpmJ)
ORF00077 ribosomal protein S13 (rpsM)
ORF00078 ribosomal protein S11 (rpsK)
ORF00080 DNA-directed RNA polymerase, alpha subunit (rpoA)



ORF00093 transcriptional regulator ComX1, putative 



ORF00094 phosphoglycerate mutase family protein
ORF00097 heat-inducible transcription repressor HrcA (hrcA)
ORF00098 heat shock protein GrpE (grpE)
ORF00099 dnaK protein (dnaK)
ORF00100 dnaJ protein (dnaJ)
ORF00101 transcriptional regulator, GntR family
ORF00102 tRNA pseudouridine synthase A (truA)
ORF00103 phosphomethylpyrimidine kinase, putative
ORF00104 conserved hypothetical protein
ORF00105 conserved hypothetical protein
ORF00106 conserved hypothetical protein
ORF00107 trigger factor (tig)
ORF00108 DNA-directed RNA polymerase, delta subunit, putative
ORF00109 CTP synthase (pyrG)
ORF00111 deoxyuridine 5'-triphosphate nucleotidohydrolase (dut)
ORF00113 carbonic anhydrase-related protein



ORF00115 pyridine nucleotide-disulphide oxidoreductase family protein 



ORF00116 glutamyl-tRNA synthetase (gltX)
ORF00119 ribose ABC transporter, ATP-binding protein (rbsA)
ORF00122 ribose operon repressor RbsR (rbsR)
ORF00125 ABC transporter, ATP-binding protein
ORF00126 DNA-binding response regulator
ORF00128 sensor histidine kinase
ORF00131 fructose-bisphosphate aldolase (fba)
ORF00132 L-2-hydroxyisocaproate dehydrogenase
ORF00133 ribosomal protein L28 (rpmB)
ORF00134 conserved hypothetical protein
ORF00135 DAK2 domain protein
ORF00136 expressed SPFH domain/Band 7 family protein
ORF00141 amino acid ABC transporter, ATP-binding protein
ORF00142 amino acid ABC transporter, amino acid-binding protein/permease protein
ORF00143 conserved hypothetical protein
ORF00145 undecaprenol kinase, putative
ORF00146 negative regulator of competence MecA, putative
ORF00149 ABC transporter, ATP-binding protein
ORF00150 conserved hypothetical protein
ORF00151 selenocysteine lyase (csdB)
ORF00152 NifU family protein
ORF00153 conserved hypothetical protein
ORF00155 D-alanyl-D-alanine carboxypeptidase



ORF00158 oligopeptide ABC transporter, permease protein 



ORF00160 oligopeptide ABC transporter, ATP-binding protein 



ORF00161 oligopeptide ABC transporter, ATP-binding protein 



ORF00167 adc operon repressor AdcR (adcR)
ORF00168 zinc ABC transporter, ATP-binding protein
ORF00169 zinc ABC transporter, permease protein
ORF00172 tyrosyl-tRNA synthetase (tyrS)
ORF00173 penicillin-binding protein 1B, putative



ORF00174 DNA-directed RNA polymerase, beta subunit (rpoB) 



ORF00176 DNA-directed RNA polymerase, beta' subunit (rpoC)
ORF00178 conserved hypothetical protein
ORF00179 competence protein CglA (cglA)
ORF00180 competence protein CglB (cglB)



ORpnmfti conserved hypothetical protein 



ORF00183 conserved hypothetical protein
ORF00184 acetate kinase (ackA)



ORF00190 pyrroline-5-carboxylate reductase (proC) 



ORF00191 glutamyl-aminopeptidase (pepA)
ORF00198 single-strand binding protein (ssb)
ORF00211 PTS system, IIABC components
ORF00212 alpha amylase family protein
ORF00214 transcriptional antiterminator, BglG family
ORF00219 PTS system, IIC component, putative
ORF00224 ribosomal protein S15 (rpsO)



ORF00225 polyribonucleotide nucleotidyltransferase (pnp) 



ORF00227 serine O-acetyltransferase (cysE)
ORF00229 cysteinyl-tRNA synthetase (cysS)
ORF00230 conserved hypothetical protein
ORF00231 RNA methyltransferase, TrmH family, group 3
ORF00233 DegV family protein
ORF00236 ribosomal protein L13 (rplM)
ORF00237 ribosomal protein S9 (rpsI)



ORF00261 transcriptional regulator, MutR family
ORF00262 transporter, putative
ORF00263 amino acid ABC transporter, permease protein
ORF00264 amino acid ABC transporter, amino acid-binding protein
ORF00265 amino acid ABC transporter, permease protein
ORF00266 amino acid ABC transporter, ATP-binding protein
ORF00295 N-acetylglucosamine-6-phosphate deacetylase (nagA)
ORF00296 conserved hypothetical protein
ORF00297 glycyl-tRNA synthetase, alpha subunit (glyQ)
ORF00299 glycyl-tRNA synthetase, beta subunit (glyS)
ORF00300 conserved hypothetical protein
ORF00302 glycerol kinase (glpK)
ORF00303 alpha-glycerophosphate oxidase
ORF00304 glycerol uptake facilitator protein (glpF)
ORF00306 conserved hypothetical protein
ORF00307 transketolase (tkt)
ORF00309 ABC transporter, ATP-binding protein
ORF00310 membrane protein, putative
ORF00313 PTS system, IIBC components
ORF00314 glutamate 5-kinase (proB)
ORF00315 gamma-glutamyl phosphate reductase (proA)
ORF00316 conserved hypothetical protein TIGR00006
ORF00318 penicillin-binding protein 2X (pbpX)
ORF00319 phospho-N-acetylmuramoyl-pentapeptide-transferase (mraY)
ORF00320 ATP-dependent RNA helicase, DEAD/DEAH box family
ORF00321 ABC transporter, substrate-binding protein
ORF00322 amino acid ABC transporter, permease protein



ORF00323 amino acid ABC transporter, ATP-binding protein 



ORF00325 thioredoxin reductase (trxB) 



ORF00326 conserved hypothetical protein
ORF00327 NAD synthetase (nadE)
ORF00328 aminopeptidase C (pepC)
ORF00329 penicillin-binding protein 1A (pbp1A)
ORF00330 recombination protein U (recU)
ORF00331 conserved hypothetical protein
ORF00335 conserved hypothetical protein



ORF00336 conserved hypothetical protein 



ORF00337 autoinducer-2 production protein LuxS (luxS)
ORF00338 KH domain protein
ORF00348 guanylate kinase (gmk)
ORF00349 DNA-directed RNA polymerase, omega subunit, putative
ORF00350 primosomal protein N' (priA)
ORF00351 methionyl-tRNA formyltransferase (fmt)
ORF00352 Sun protein (sun)



ORF00353 serine/threonine phosphatase, putative 



ORF00354 serine/threonine protein kinase 



ORF00355 conserved hypothetical protein
ORF00356 sensor histidine kinase, putative
ORF00358 DNA-binding response regulator
ORF00359 hydrolase, haloacid dehalogenase family/peptidyl-prolyl cis-trans isomerase, cyclophilin type
ORF00360 general stress protein, putative
formate-lyase-activating enzyme (pflA)
ORF00362 transcriptional regulator, DeoR family
ORF00363 transcriptional regulator, putative
ORF00364 PTS system, cellobiose-specific IIA component (celC)
ORF00366 PTS system, cellobiose-specific IIB component (celA)
ORF00367 PTS system, cellobiose-specific IIC component (celB)



ORF00368 formate acetyltransferase (pflD)
ORF00369 transaldolase family protein
ORF00371 glycerol dehydrogenase (gldA)
ORF00372 cysteine synthase A (cysK)



ORF00373 conserved hypothetical protein TIGR00257 



ORF00374 helicase, putative 



ORF00375 competence protein F, putative 



ORF00376 ribosomal subunit interface protein (yfiA) 



ORF00385 enoyl-CoA hydratase/isomerase family protein 



ORF00386 transcriptional regulator, MarR family
ORF00387 3-oxoacyl-(acyl-carrier-protein) synthase III (fabH)
ORF00388 acyl carrier protein (acpP)
ORF00390 enoyl-(acyl-carrier-protein) reductase II (fabK)
ORF00391 malonyl CoA-acyl carrier protein transacylase (fabD)
ORF00392 3-oxoacyl-[acyl-carrier protein] reductase (fabG)
ORF00393 3-oxoacyl-(acyl-carrier-protein) synthase II (fabF)
ORF00394 acetyl-CoA carboxylase, biotin carboxyl carrier protein (accB)
ORF00395 (3R)-hydroxymyristoyl-(acyl-carrier-protein) dehydratase (fabZ)
ORF00396 acetyl-CoA carboxylase, biotin carboxylase (accC)
ORF00397 acetyl-CoA carboxylase, carboxyl transferase, beta subunit (accD)
ORF00398 acetyl-CoA carboxylase, carboxyl transferase, alpha subunit (accA)
ORF00400 seryl-tRNA synthetase (serS)

ORF00403 conserved hypothetical protein 

ORF00404 PTS system, mannose-specific IID component
ORF00405 PTS system, mannose-specific IIC component (manM)
ORF00406 PTS system, mannose-specific IIAB components (manL)
ORF00407 hydrolase, haloacid dehalogenase-like family



ORF00410 xanthine/uracil permease family protein 



ORF00411 conserved hypothetical protein TIGR00150, putative
ORF00412 acetyltransferase, GNAT family
ORF00413 expressed protein of unknown function
ORF00415 HIT family protein (hit)
ORF00419 ABC transporter, ATP-binding protein



ORF00421 ABC transporter, permease protein 



ORF00422 conserved hypothetical protein 



ORF00423 conserved hypothetical protein TIGR00091
ORF00424 conserved hypothetical protein, POINT MUTATION
ORF00425 N utilization substance protein A (nusA)
ORF00426 conserved hypothetical protein
ORF00427 ribosomal protein L7A family
ORF00428 translation initiation factor IF-2
ORF00429 ribosome-binding factor A (rbfA)
ORF00432 copper-transporter ATPase CopA
ORF00435 hydrolase, haloacid dehalogenase-like family
ORF00436 DNA polymerase I (polA)
ORF00437 CoA binding domain protein
ORF00440 DNA-binding response regulator
ORF00441 sensor histidine kinase
ORF00443 queuine tRNA-ribosyltransferase (tgt)
ORF00444 conserved hypothetical protein
ORF00449 glucose-6-phosphate isomerase (pgi)
ORF00451 rhomboid family protein
ORF00452 expressed putative lipoprotein
ORF00453 UTP-glucose-1-phosphate uridylyltransferase (galU)
ORF00454 glycerol-3-phosphate dehydrogenase (NAD(P)+) (gpsA)
ORF00455 ribonuclease P protein component (rnpA)
ORF00456 SpoIIIJ family protein
ORF00458 R3H domain protein
ORF00463 conserved hypothetical protein
ORF00464 RecX protein
ORF00465 RNA methyltransferase, TrmA family
ORF00470 ribonucleoside-diphosphate reductase 2, beta subunit (nrdF)
ORF00472 ribonucleoside-diphosphate reductase 2, alpha subunit (nrdE)



ORF00482 alcohol dehydrogenase, zinc-containing 



ORF00483 oxidoreductase, aldo/keto reductase family 



ORF00484 cation efflux system protein 



ORF00485 transcriptional regulator, TetR family 



ORF00496 conserved hypothetical protein 



ORF00500 acetyltransferase, GNAT family 



ORF00501 conserved hypothetical protein 



ORF00502 valyl-tRNA synthetase (valS) 



ORF00508 aspartate-ammonia ligase (asnA)
ORF00511 type II DNA modification methyltransferase, putative
ORF00513 phosphopantetheine adenylyltransferase (coaD)
ORF00515 conserved hypothetical protein



ORF00519 conserved hypothetical protein 



ORF00520 conserved hypothetical protein TIGR00048 



ORF00522 ABC transporter, ATP-binding/permease protein 



ORF00523 ABC transporter, ATP-binding/permease protein 



ORF00524 anthranilate synthase component II (trpG) 



ORF00532 endonuclease III (nth)
ORF00534 conserved hypothetical protein
ORF00535 glucokinase (glk)



ORF00536 expressed protein with rhodanese domain 



ORF00537 elongation factor Tu family protein 



ORF00540 UDP-N-acetylmuramoylalanine-D-glutamate ligase (murD)
ORF00541 UDP-N-acetylglucosamine-N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase (murG)
ORF00542 cell division protein DivIB, putative
ORF00544 cell division protein FtsA (ftsA)
ORF00545 cell division protein FtsZ (ftsZ)



ORF00546 ylmE protein, putative 



ORF00547 ylmF protein (ylmF) 



ORF00549 ylmH protein (ylmH) 



ORF00550 cell division protein DivIVA, putative
ORF00552 isoleucyl-tRNA synthetase (ileS)
ORF00553 conserved hypothetical protein
ORF00554 MutT/nudix family protein
ORF00555 ATP-dependent Clp protease, ATP-binding subunit
ORF00557 conserved hypothetical protein
ORF00558 amino acid ABC transporter, permease protein
ORF00559 amino acid ABC transporter, ATP-binding protein
ORF00560 phosphoglucomutase/phosphomannomutase family protein
ORF00562 methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase (folD)
ORF00564 exodeoxyribonuclease VII, large subunit (xseA)
ORF00566 geranyltranstransferase, putative
ORF00567 hemolysin A
ORF00570 DNA repair protein RecN (recN)
ORF00571 expressed DegV family protein
ORF00574 DNA-binding protein HU (hup) 



ORF00576 dihydroorotate dehydrogenase A (pyrDA) 



ORF00577 beta-lactam resistance factor (fibB)
ORF00578 beta-lactam resistance factor (fibA)
ORF00579 murM protein, putative
ORF00580 hydrolase, haloacid dehalogenase-like family
ORF00581 HD domain protein
ORF00582 conserved hypothetical protein
ORF00583 cation-transporting ATPase, E1-E2 family
ORF00588 cell division ABC transporter, ATP-binding protein FtsE (ftsE)
ORF00589 cell division ABC transporter, permease protein FtsX (ftsX)
ORF00591 metallo-beta-lactamase superfamily protein
ORF00593 DNA polymerase III, epsilon subunit/ATP-dependent helicase DinG
ORF00595 aspartate aminotransferase (aspC)
ORF00596 asparaginyl-tRNA synthetase (asnS)
ORF00601 conserved hypothetical protein
ORF00602 conserved hypothetical protein
ORF00603 conserved hypothetical protein
ORF00605 zinc ABC transporter, zinc-binding adhesion lipoprotein



ORF00606 ribosomal protein L31 (rpmE) 



ORF00607 DHH family protein 
ORF00609 flavodoxin
ORF00614 ribosomal protein L19 (rplS)
ORF00640 prophage LambdaSa1, single-strand binding protein (ssb)
ORF00693 DNA-binding response regulator VncR (vncR)
ORF00694 sensor histidine kinase VncS (vncS)
ORF00699 rod shape-determining protein RodA, putative (rodA)
ORF00700 hydrolase, haloacid dehalogenase-like family
ORF00701 DNA gyrase, B subunit (gyrB)
ORF00702 septation ring formation regulator EzrA, putative
ORF00705 conserved hypothetical protein
ORF00706 enolase (eno)
ORF00708 3-phosphoshikimate 1-carboxyvinyltransferase (aroA)
ORF00709 shikimate kinase (aroK)
ORF00710 psr protein



ORF00711 RNA methyltransferase, TrmA family 



ORF00729 sortase family protein
ORF00731 sortase family protein
ORF00734 sortase family protein, FRAMESHIFT
ORF00743 ABC transporter, ATP-binding protein
ORF00744 membrane protein
ORF00745 conserved hypothetical protein
ORF00748 cylG protein (cylG)
ORF00776 DNA-entry nuclease, putative
ORF00789 2-keto-3-deoxygluconate kinase
ORF00792 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase (eda)
ORF00798 proline dipeptidase (pepQ)
ORF00799 transcriptional regulator, RegM family
ORF00802 glycosyl transferase, group 1 family protein
ORF00803 threonyl-tRNA synthetase (thrS)
ORF00804 DNA-binding response regulator
ORF00808 amino acid ABC transporter, permease protein
ORF00810 amino acid ABC transporter, ATP-binding protein
ORF00811 DNA-binding response regulator
ORF00812 sensory box histidine kinase



ORF00813 metallo-beta-lactamase family protein
ORF00815 ribonuclease III (rnc)
ORF00816 expressed putative chromosome segregation SMC protein
ORF00817 hydrolase, haloacid dehalogenase-like family
ORF00818 hydrolase, haloacid dehalogenase-like family
ORF00819 signal recognition particle-docking protein FtsY (ftsY)
ORF00820 ABC transporter, substrate-binding protein
ORF00821 ABC transporter, permease protein, putative
ORF00824 transcriptional accessory protein Tex, putative
ORF00825 conserved hypothetical protein
ORF00828 HPr(Ser) kinase/phosphatase (hprK)
ORF00830 prolipoprotein diacylglyceryl transferase (lgt)
ORF00832 conserved hypothetical protein
ORF00835 peptidase, U32 family, putative
ORF00836 peptidase, U32 family
ORF00837 conserved hypothetical protein
ORF00844 lysyl-tRNA synthetase (lysS)
ORF00846 phosphoglycerate mutase family protein



ORF00847 ebsC family protein, putative 
ORF00850 peptidase, U32 family 



ORF00855 oligoendopeptidase F, putative 



ORF00856 phosphoenolpyruvate carboxylase (ppc)
ORF00859 cell division protein, FtsW/RodA/SpoVE family (ftsW)
ORF00861 translation elongation factor Tu (tuf)
ORF00863 triosephosphate isomerase (tpiA)
ORF00865 phosphoglycerate mutase (gpmA)
ORF00867 recombination protein RecR (recR)
ORF00868 D-alanine-D-alanine ligase
ORF00869 UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate-D-alanyl-D-alanyl ligase (murF)
ORF00870 oxalate:formate antiporter
ORF00871 membrane protein, putative
ORF00873 peptide chain release factor 3 (prfC)
ORF00876 ABC transporter, ATP-binding protein
ORF00880 ATP-dependent RNA helicase, DEAD/DEAH box family
ORF00882 conserved hypothetical protein
ORF00883 conserved hypothetical protein 
ORF00884 acyltransferase family protein 



ORF00885 competence protein CelA (celA)
ORF00887 DNA internalization-related competence protein ComEC/Rec2
ORF00889 sugar-binding transcriptional regulator, LacI family
ORF00892 DNA polymerase III, delta subunit, putative
ORF00893 superoxide dismutase, Fe-Mn (sodA)
ORF00894 transcriptional antiterminator LicT
ORF00895 PTS system, beta-glucosides-specific IIABC components
ORF00896 6-phospho-beta-glucosidase (bglA)
ORF00899 glycerate kinase 2 (garK)
ORF00904 S-adenosylmethionine:tRNA ribosyltransferase-isomerase (queA)
ORF00906 glucosamine-6-phosphate isomerase (nagB)



ORF00908 ribosomal small subunit pseudouridine 
ORF00911 competence protein CoiA (coiA)



ORF00912 oligoendopeptidase B (pepB) 



ORF00914 O-methyltransferase family protein 



ORF00916 protease maturation protein, putative 



ORF00919 alanyl-tRNA synthetase (alaS)
ORF00925 transcriptional regulator, Cro/CI family
ORF00928 ribonucleoside-diphosphate reductase 2, beta subunit (nrdF)
ORF00929 ribonucleoside-diphosphate reductase 2, alpha subunit (nrdE)
ORF00930 ribonucleoside-diphosphate reductase 2, NrdH-redoxin (nrdH)
ORF00931 phosphocarrier protein HPr (ptsH)
ORF00932 phosphoenolpyruvate-protein phosphotransferase (ptsI)
ORF00933 glyceraldehyde-3-phosphate dehydrogenase, NADP-dependent (gapN)
ORF00934 polysaccharide deacetylase family protein
ORF00935 ATP-dependent RNA helicase, DEAD/DEAH box family



ORF00936 uridine kinase (udk) 



ORF00937 conserved hypothetical protein 

ORF00938 DNA polymerase III, gamma and tau subunits (dnaX)
ORF00940 biotin-acetyl-CoA-carboxylase ligase
ORF00941 S-adenosylmethionine synthetase (metK)
ORF00955 UDP-N-acetylglucosamine 1-carboxyvinyltransferase (murA)
ORF00956 acetyltransferase, GNAT family
ORF00957 CBS domain protein
ORF00958 methionine aminopeptidase, type I (map)
ORF00959 ribonuclease BN, putative
ORF00962 conserved hypothetical protein
ORF00963 DNA ligase, NAD-dependent (ligA)
ORF00964 BmrU protein, putative
ORF00966 pullulanase, putative
ORF00973 ATP synthase F0, A subunit (a
ORF00974 ATP synthase F0, B subunit (al
ORF00975 ATP synthase F1, delta subunit (atpH)
ORF00976 ATP synthase F1, alpha subunit (atpA)
ORF00977 ATP synthase F1, gamma subunit (atpG)
ORF00978 ATP synthase F1, beta subunit (atpD)
ORF00979 ATP synthase F1, epsilon subunit (atpC)
ORF00981 UDP-N-acetylglucosamine 1-carboxyvinyltransferase (murA)
ORF00983 DNA-entry nuclease (endA)
ORF00984 phenylalanyl-tRNA synthetase, alpha subunit (pheS)
ORF00986 phenylalanyl-tRNA synthetase, beta subunit (pheT)
ORF00988 exonuclease RexB (rexB)



ORF00989 exonuclease RexA (rexA) 



ORF00991 tRNA modification GTPase TrmE (trmE) 



ORF00992 ABC transporter, ATP-binding protein 



acetoin dehydrogenase, thymine PPi dependent, E1 component, alpha subunit
ORF00994 acetoin dehydrogenase, thymine PPi dependent, E1 component, beta subunit
ORF00995 acetoin dehydrogenase, thymine PPi dependent, E2 component
ORF00996 acetoin dehydrogenase, thymine PPi dependent, E3 component, dihydrolipoamide dehydrogenase
ORF00997 lipoate-protein ligase A (lplA)
ORF00998 cobyric acid synthase, putative
ORF00999 mur ligase family protein
ORF01000 conserved hypothetical protein TIGR00159



ORF01001 expressed protein of unknown function 



ORF0100? phosphoglucomutase/phosphomannomutase family protein
ORF01005 oxygen-independent coproporphyrinogen III oxidase, putative



ORF01006 conserved hypothetical protein 



ORF01007 hydrolase, haloacid dehalogenase-like family 



ORF01008 conserved hypothetical protein
ORF01023 GTP-binding protein LepA (lepA)
ORF01027 PilB-related protein
ORF01030 cation-transporting ATPase, E1-E2 family
ORF01033 conserved hypothetical protein
ORF01040 Tn916, tetracycline resistance protein (tetM)
ORF01057 transcriptional regulator, GntR family
ORF01058 DNA polymerase III, alpha subunit (dnaE)
ORF01059 6-phosphofructokinase (pfk)
ORF01060 pyruvate kinase (pyk)
ORF01063 glucosamine-fructose-6-phosphate aminotransferase (isomerizing) (glmS)
ORF01066 phnA protein (phnA)



ORF01068 amino acid ABC transporter, permease protein
ORF01069 amino acid ABC transporter, ATP-binding protein
ORF01070 amino acid ABC transporter, amino acid-binding protein
ORF01072 ribosomal protein S20 (rpsT)
ORF01073 pantothenate kinase (coaA)
ORF01074 conserved hypothetical protein
ORF01075 cytidine deaminase (cdd)
ORF01076 expressed putative lipoprotein
ORF01077 sugar ABC transporter, ATP-binding protein
ORF01078 sugar ABC transporter, permease protein, putative
ORF01079 sugar ABC transporter, permease protein, putative
ORF01080 NADH oxidase (nox-2)
ORF01081 L-lactate dehydrogenase (ldh)
ORF01082 DNA gyrase, A subunit (gyrA)
ORF01083 sortase SrtA (srtA)
ORF01089 GMP synthase (guaA)
ORF01090 transcriptional regulator, GntR family
ORF01091 gid protein (gid)
ORF01093 expressed putative lipoprotein
ORF01097 ABC transporter, ATP-binding protein
ORF01099 DNA-binding response regulator



ORF01101 site-specific recombinase, phage integrase family 

ORF01106 signal recognition particle protein Ffh (ffh) 

ORF01108 conserved hypothetical protein 



ORF01109 sensor histidine kinase CiaR
ORF01110 DNA-binding response regulator CiaR (ciaR)
ORF01111 aminopeptidase N (pepN)



ORF01112 phosphate transport system regulatory protein PhoU (phoU) 



ORF01113 phosphate ABC transporter, ATP-binding protein PstB, putative
ORF01114 phosphate ABC transporter, ATP-binding protein PstB, putative
ORF01115 phosphate ABC transporter, permease protein PstA, putative
ORF01116 phosphate ABC transporter, permease protein
ORF01117 phosphate ABC transporter, phosphate-binding protein
ORF01118 NOL1/NOP2/sun family protein



ORF01119 inositol monophosphatase family protein 



ORF01120 conserved hypothetical protein 



ORF01121 conserved hypothetical protein 



ORF01122 macrolide-efflux protein mreA/riboflavin biosynthesis protein RibF
ORF01123 tRNA pseudouridine synthase B (truB)
ORF01125 conserved hypothetical protein



ORF01128 permease, putative 



ORF01129 ABC transporter, ATP-binding protein 



ORF01131 DNA topoisomerase I (topA) 



ORF01132 DprA/SMF protein, putative DNA processing factor (dprA)
ORF01134 iron compound ABC transporter, ATP-binding protein
ORF01137 acetyltransferase, CysE/LacA/LpxA/NodL family
ORF01138 ribonuclease HII (rnhB)
ORF01139 GTP-binding protein
ORF01176 carbamoyl-phosphate synthase, large subunit (carB)
ORF01177 carbamoyl-phosphate synthase, small subunit (carA)



ORF01178 aspartate carbamoyltransferase (pyrB) 



ORF01179 dihydroorotase, multifunctional complex type (pyrC) 



ORF01180 orotate phosphoribosyltransferase (pyrE) 



ORF01181 orotidine 5'-phosphate decarboxylase (pyrF)
ORF01183 ABC transporter, ATP-binding protein
ORF01184 ribonucleotide reductase, truncation
ORF01188 cardiolipin synthetase (cls)
ORF01189 formate-tetrahydrofolate ligase (fhs)
ORF01190 lipoate-protein ligase A (lplA)
ORF01198 flavoprotein-related protein
ORF01199 flavoprotein family protein
ORF01200 membrane protein, putative
ORF01201 phosphoglucomutase (pgm)
ORF01203 IS861, transposase OrfB
ORF01205 ABC transporter, ATP-binding/permease protein
ORF01206 ABC transporter, ATP-binding/permease protein
ORF01207 conserved hypothetical protein
ORF01208 conserved hypothetical protein
ORF01209 serine hydroxymethyltransferase
ORF01210 Sua5/YciO/YrdC/YwlC family protein
ORF01211 modification methylase, HemK family 



ORF01212 peptide chain release factor 1 (prfA) 



ORF01213 thymidine kinase (tdk)
ORF01214 4-oxalocrotonate tautomerase (xylM)
ORF01216 ApbE family protein
ORF01220 xanthine permease (pbuX)
ORF01221 xanthine phosphoribosyltransferase (xpt)
ORF01222 guanosine monophosphate reductase (guaC)
ORF01227 phosphate acetyltransferase
ORF01228 ribosomal large subunit pseudouridine synthase, RluD subfamily
ORF01229 expressed protein of unknown function
ORF01230 GTP pyrophosphokinase family protein
ORF01231 conserved hypothetical protein



ORF01232 ribose-phosphate pyrophosphokinase (prsA) 



ORF01233 cysteine desulphurase (iscS) 



ORF01234 conserved hypothetical protein 



ORF01235 conserved hypothetical protein 



ORF01236 DNA repair protein RadC (radC) 



ORF01238 6-phospho-beta-glucosidase (ascB) 



ORF01239 platelet activating factor, putative 



ORF01240 hydrolase, haloacid dehalogenase-like family
ORF01242 voltage-gated chloride channel family protein
ORF01243 spermidine/putrescine ABC transporter, spermidine/putrescine-binding protein (potD)
ORF01244 spermidine/putrescine ABC transporter, permease protein (potC)
ORF01245 spermidine/putrescine ABC transporter, permease protein (potB)
ORF01246 spermidine/putrescine ABC transporter, ATP-binding protein (potA)
ORF01247 UDP-N-acetylenolpyruvoylglucosamine reductase (murB)
ORF01248 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase (folK)
ORF01250 dihydropteroate synthase (folP)
ORF01251 GTP cyclohydrolase I (folE)
ORF01252 folylpolyglutamate synthase (folC)



ORF01259 aldehyde dehydrogenase family protein 



ORF01260 membrane protein
ORF01274 gls24 protein, putative
ORF01276 gls24 protein, putative
ORF01279 conserved hypothetical protein
ORF01282 ATP-dependent DNA helicase PcrA (pcrA)
ORF01283 conserved hypothetical protein, FRAMESHIFT
ORF01284 uracil permease (uraA)
ORF01285 sodium:alanine symporter family protein
ORF01286 cation efflux family protein
ORF01290 ribosomal protein S1 (rpsA)
ORF01292 branched-chain amino acid aminotransferase (ilvE)
ORF01294 DNA topoisomerase IV, A subunit (parC)
ORF01295 DNA topoisomerase IV, B subunit (parE)
ORF01296 membrane protein, putative
ORF01297 uracil-DNA glycosylase (ung)
ORF01317 transcriptional regulator, LysR family, putative
ORF01319 purine nucleoside phosphorylase (deoD)
ORF01321 purine nucleoside phosphorylase (deoD)
ORF01323 phosphopentomutase (deoB)



ORF01324 ribose 5-phosphate isomerase (rpiA) 



ORF01327 tributyrin esterase (estA) 



ORF01328 metallo-beta-lactamase superfamily protein 
ORF01329 ABC transporter, ATP-binding protein



ORF01330 ABC transporter, permease protein 



ORF01331 conserved hypothetical protein 



ORF01332 adherence and virulence protein A (pavA) 



ORF01335 TPR domain protein
ORF01336 membrane protein
ORF01338 mutator MutT protein (mutX)
ORF01339 hyaluronidase
ORF01343 iminodiacetate oxidase, putative
ORF01344 conserved hypothetical protein TIGR00486
ORF01345 conserved hypothetical protein
ORF01346 DNA replication protein DnaD, putative
ORF01347 adenine phosphoribosyltransferase (apt)






ORF01350 single-stranded-DNA-specific exonuclease RecJ (recJ)
ORF01351 oxidoreductase, short chain dehydrogenase/reductase family
ORF01352 metallo-beta-lactamase superfamily protein
ORF01353 conserved hypothetical protein
ORF01354 GTP-binding protein HflX (hflX)
ORF01355 tRNA delta(2)-isopentenylpyrophosphate transferase (miaA)
ORF01357 exfoliative toxin A, putative
ORF01358 pullulanase, putative
ORF01362 conserved hypothetical protein
ORF01363 peptidase, M20/M25/M40 family
ORF01364 nitroreductase family protein
ORF01367 excinuclease ABC, C subunit (uvrC)
ORF01380 streptococcal histidine triad family protein
ORF01381 laminin-binding surface protein (lmb)
ORF01397 Tn5252, relaxase
ORF01403 mercuric reductase (merA)
ORF01406 IS861, transposase OrfB, truncation



ORF01407 cation-transporting ATPase, E1-E2 family 



ORF01411 conserved hypothetical protein
ORF01412 cation-transporting ATPase, E1-E2 family
ORF01415 transcriptional repressor CopY, putative



ORF01416 cadmium resistance transporter, putative 



ORF01451 C-5 cytosine-specific DNA methylase 



ORF01453 conserved hypothetical protein
ORF01455 ribosomal protein L7/L12 (rplL)
ORF01456 ribosomal protein L10 (rplJ)
ORF01458 ATP-dependent Clp protease, ATP-binding subunit
ORF01467 GTP-binding protein (cgpA)
ORF01468 ATP-dependent Clp protease, ATP-binding subunit ClpX (clpX)
ORF01470 dihydrofolate reductase (folA)
ORF01471 thymidylate synthase (thyA)
ORF01472 HMG-CoA synthase
ORF01473 3-hydroxy-3-methylglutaryl-CoA reductase
ORF01474 conserved hypothetical protein
ORF01475 hemolysin III, putative
ORF01476 conserved hypothetical protein TIGR00147
ORF01479 isopentenyl-diphosphate delta-isomerase
ORF01480 phosphomevalonate kinase
ORF01481 diphosphomevalonate decarboxylase (mvaD)
ORF01482 mevalonate kinase, putative
ORF01484 DNA-binding response regulator
ORF01491 polypeptide deformylase, putative



IORF01495 ABC transporter, ATP-binding/permease protein 



ORF01496 ABC transporter, ATP-binding/permease protein 



ORF01498 ABC transporter, ATP-binding protein 



ORF01499 polyA polymerase family protein



ORF01500 DegV family protein



ORF01501 expressed protein of unknown function 



ORF01504 PTS system, fructose specific IIABC components



ORF01505 1-phosphofructokinase (fruK)



ORF01506 lactose phosphotransferase system repressor (lacR) 



ORF01507 beta-lactam resistance factor



ORF01511 pyridine nucleotide-disulphide oxidoreductase family protein 



ORF01512 tRNA (guanine-N1)-methyltransferase (trmD)
ORF01513 16S rRNA processing protein RimM (rimM)
ORF01515 transcriptional regulator, RofA family



ORF01516 KH domain protein
ORF01517 ribosomal protein S16 (rpsP)
ORF01518 permease, putative
ORF01519 ABC transporter, ATP-binding protein
ORF01520 conserved hypothetical protein



ORF01523 carbamoyl-phosphate synthase, small subunit (carA)
ORF01524 pyrimidine operon regulatory protein (pyrR)
ORF01525 ribosomal large subunit pseudouridine synthase, RluD subfamily
ORF01526 lipoprotein signal peptidase (lspA)
ORF01527 transcriptional regulator, LysR family
ORF01528 ribosomal protein L27 (rpmA)
ORF01529 conserved hypothetical protein



ORF01530 ribosomal protein L21 (rplU)



ORF01531 conserved hypothetical protein, FRAMESHIFT



ORF01532 thiamine biosynthesis protein ThiI (thiI)
ORF01533 cysteine desulphurase (iscS)
ORF01536 glutathione reductase (gor) 



ORF01537 conserved hypothetical protein 



ORF01538 chorismate synthase (aroC)



ORF01539 3-dehydroquinate synthase (aroB)



ORF01540 3-dehydroquinate dehydratase (aroD)
ORF01541 conserved hypothetical protein



ORF01543 ribosomal protein L20 (rplT)



ORF01544 ribosomal protein L35 (rpmI)
ORF01545 translation initiation factor IF-3 (infC)



ORF01546 cytidylate kinase (cmk)



ORF01548 ferredoxin, 4Fe-4S



ORF01550 peptidase T (pepT)



ORF01551 polysaccharide biosynthesis protein, putative

lnRFOi 552UDP-N^tylmuramo vialan^ 

ORF01553 iron compound ABC transporter, ATP-binding protein (fepC)
ORF01555 iron compound ABC transporter, permease protein



ORF01556 iron compound ABC transporter, permease protein



ORF01558 inorganic pyrophosphatase, manganese-dependent



ORF01559 pyruvate formate-lyase-activating enzyme (pflA)



ORF01560 CBS domain protein



ORF01561 conserved hypothetical protein
ORF01564 PAP2 family protein



ORF01565 membrane protein, putative



ORF01567 expressed sortase family protein



ORF01568 sortase family protein



ORF01571 rogB protein FRAMESHIFT (rogB)
ORF01587 conserved hypothetical protein
ORF01589 RNA polymerase sigma-70 factor
ORF01590 DNA primase (dnaG)

ORF01591 large conductance mechanosensitive channel protein (mscL)



ORF01592 ribosomal protein S21 (rpsU)

ORF01594 amino acid ABC transporter, amino acid-binding protein



ORF01598 rhodanese family protein 



ORF01602 glycogen phosphorylase (glgP) 



ORF01603 4-alpha-glucanotransferase (malQ) 



ORF01604 maltose operon repressor MalR, putative

ORF01605 maltose/maltodextrin ABC transporter, maltose/maltodextrin-binding protein
ORF01606 maltose ABC transporter, permease protein
ORF01607 maltose ABC transporter, permease protein

ORF01614 preprotein translocase SecA subunit, putative

ORF01619 preprotein translocase SecY family protein

ORF01634 excinuclease ABC, B subunit (uvrB)

ORF01636 glutamine ABC transporter, glutamine-binding protein/permease protein (glnP)

ORF01637 glutamine ABC transporter, ATP-binding protein, GlnQ putative

ORF01640 GTP-binding protein, GTP1/Obg family (obg)

ORF01646 amidase family protein

ORF01647 ribosomal small subunit pseudouridine synthase A (rsuA)

ORF01648 oxidoreductase, aldo/keto reductase family

ORF01651 lactoylglutathione lyase (gloA)

ORF01652 glycosyl transferase, group 2 family protein

ORF01654 SsrA-binding protein (smpB)

ORF01655 exoribonuclease, VacB/Rnb family (vacB)

ORF01657 preprotein translocase, SecG subunit

ORF01658 multi-drug resistance protein

ORF01662 dephospho-CoA kinase

ORF01663 formamidopyrimidine-DNA glycosylase (mutM)

ORF01677 GTP-binding protein Era (era)

ORF01678 diacylglycerol kinase (dgkA)

ORF01679 conserved hypothetical protein TIGR00043

ORF01685 PhoH family protein

ORF01687 conserved hypothetical protein

ORF01689 conserved hypothetical protein

ORF01690 ribosome recycling factor (frr)

ORF01691 uridylate kinase (pyrH)

ORF01693 peptide ABC transporter, ATP-binding protein FRAMESHIFT 

ORF01697 ribosomal protein L1 (rplA) 

ORF01698 ribosomal protein L11 (rplK)

ORF01706 IS861, transposase OrfB

ORF01707 chorismate binding enzyme

ORF01708 FtsK/SpoIIIE family protein

ORF01709 peptidyl-prolyl cis-trans isomerase, cyclophilin-type

ORF01710 manganese ABC transporter, permease protein

ORF01711 manganese ABC transporter, ATP-binding protein

ORF01712 manganese ABC transporter, manganese-binding adhesion lipoprotein

ORF01713 iron-dependent transcriptional regulator

ORF01714 5-methylthioadenosine nucleosidase/S-adenosylhomocysteine nucleosidase (pfs)

ORF01716 MutT/nudix family protein

ORF01718 UDP-N-acetylglucosamine pyrophosphorylase (glmU)

ORF01722 oxidoreductase, Gfo/Idh/MocA family

ORF01725 gluconate 5-dehydrogenase, putative

ORF01726 conserved hypothetical protein

ORF01738 branched-chain amino acid transport system II carrier protein (brnQ)

ORF01739 methionyl-tRNA synthetase (metG)

ORF01745 exodeoxyribonuclease (exoA)

ORF01746 conserved hypothetical protein

ORF01752 copper homeostasis protein CutC, putative

ORF01755 tetrapyrrole methylase family protein

ORF01756 conserved hypothetical protein

ORF01758 DNA polymerase III, delta prime subunit, putative

ORF01759 thymidylate kinase (tmk)

ORF01773 ATP-dependent Clp protease, proteolytic subunit ClpP (clpP)

ORF01774 uracil phosphoribosyltransferase (upp)

ORF01777 RNA methyltransferase, TrmH family, group 2
ORF01781 conserved hypothetical protein TIGR00278
ORF01782 ribosomal large subunit pseudouridine synthase B (rluB)



ORF01783 conserved hypothetical protein TIGR00281



ORF01784 conserved hypothetical protein 



ORF01785 integrase/recombinase, phage integrase family



ORF01786 CBS domain protein 



ORF01787 conserved hypothetical protein



ORF01788 HAM1 protein



ORF01789 glutamate racemase (murI)



ORF01791 membrane protein, putative 



ORF01792 transcriptional regulator, biotin repressor family



ORF01793 membrane protein, putative 



ORF01795 RNA methyltransferase, TrmH family 



ORF01796 acylphosphatase



ORF01797 lipoprotein, putative 



ORF01799 amino acid ABC transporter, permease protein



ORF01801 amidase family protein 



ORF01802 transcription elongation factor GreA (greA)



ORF01803 conserved hypothetical protein 



ORF01804 acetyltransferase, GNAT family 



ORF01805 UDP-N-acetylmuramate-alanine ligase (murC)



ORF01806 conserved hypothetical protein 

ORF01808 expressed putative helicase 

ORF01811 phosphoglycerate dehydrogenase-related protein



ORF01812 primosomal protein DnaI (dnaI)
ORF01813 conserved hypothetical protein 



ORF01814 conserved hypothetical protein TIGR00244 



ORF01815 sensor histidine kinase CsrS (csrS)



ORF01816 DNA-binding response regulator CsrR (csrR)



ORF01817 conserved hypothetical protein 



ORF01818 heat shock protein HtpX (htpX) 



ORF01820 lemA protein (lemA) 



ORF01821 glucose-inhibited division protein B (gidB)



ORF01822 sodium transport family protein 



ORF01823 potassium uptake protein, Trk family, putative 



ORF01825 ABC transporter, ATP-binding protein

ORF01828 branched-chain amino acid transport system II carrier protein (brnQ)



ORF01829 alcohol dehydrogenase, zinc-containing (adh)



ORF01830 ABC transporter, permease protein 



ORF01831 ABC transporter, ATP-binding protein



ORF01833 expressed YaeC family protein



ORF01834 ABC transporter, substrate-binding protein



ORF01835 glutamine amidotransferase, class I 



ORF01837 conserved hypothetical protein TIGR01033



ORF01846 glycerol uptake facilitator protein (glpF) 



ORF01849 conserved hypothetical protein



ORF01851 conserved hypothetical protein 



ORF01852 iojap-related protein



ORF01854 conserved hypothetical protein TIGR00488 



ORF01855 conserved hypothetical protein TIGR00482 
ORF01856 conserved hypothetical protein TIGR00253 



ORF01857 GTP-binding protein
ORF01858 hydrolase, haloacid dehalogenase-like family



ORF01860 glutamyl-tRNA(Gln) amidotransferase, B subunit (gatB)



ORF01861 glutamyl-tRNA(Gln) amidotransferase, A subunit (gatA)
ORF01862 glutamyl-tRNA(Gln) amidotransferase, C subunit (gatC)

ORF01867 isochorismatase family protein

ORF01869 transcriptional regulator CodY, putative



ORF01870 aminotransferase, class I



ORF01871 universal stress protein family FRAMESHIFT



ORF01872 hydrolase, haloacid dehalogenase-like family 



ORF01873 asparaginase family protein 



ORF01874 shikimate 5-dehydrogenase (aroE) 



ORF01876 ATP-dependent DNA helicase RecG (recG)
ORF01878 alanine racemase (alr)



ORF01879 holo-(acyl-carrier-protein) synthase (acpS) 



ORF01881 preprotein translocase, SecA subunit (secA) 



ORF01882 mannose-6-phosphate isomerase, class I (manA)
ORF01883 fructokinase (scrK)
ORF01885 PTS system IIABC components



ORF01886 sucrose-6-phosphate hydrolase (scrB)



ORF01887 sucrose operon repressor ScrR (scrR)



ORF01888 N utilization substance protein B (nusB)



ORF01889 conserved hypothetical protein



ORF01890 translation elongation factor P (efp) 



ORF01900 cytidine/deoxycytidylate deaminase family protein



ORF01906 excinuclease ABC, A subunit (uvrA) 
ORF01907 conserved hypothetical protein
ORF01908 magnesium transporter, CorA family (corA)
ORF01909 ribosomal protein S18 (rpsR)
ORF01910 single-strand binding protein (ssb)
ORF01911 ribosomal protein S6 (rpsF)
ORF01912 A/G-specific adenine glycosylase (mutY)
ORF01914 thioredoxin (trx)
ORF01915 PAP2 family protein



ORF01916 MutS2 family protein



ORF01917 conserved hypothetical protein 



ORF01918 conserved hypothetical protein
ORF01919 ribonuclease HIII (rnhC)
ORF01920 signal peptidase I



ORF01921 helicase, putative



ORF01923 DNA-damage inducible protein P (dinP)



ORF01924 formate acetyltransferase (pflD)



ORF01926 conserved hypothetical protein

ORF01927 proteinase, putative, degenerate, FRAMESHIFT
ORF01929 glycerol uptake facilitator protein, putative



ORF01930 universal stress protein family

ORF01933 X-pro dipeptidyl-peptidase (pepX)

ORF01937 ABC transporter, ATP-binding protein CydC (cydC)



ORF01938 ABC transporter, ATP-binding protein CydD



ORF01945 conserved hypothetical protein TIGR00103



ORF01948 exonuclease

ORF01949 conserved hypothetical protein



ORF01950 conserved hypothetical protein TIGR00275



ORF01952 ribosomal protein S14 (rpsN)
ORF01957 O-sialoglycoprotein endopeptidase family protein



ORF01958 ribosomal-protein-alanine acetyltransferase, putative
ORF01960 expressed protein of unknown function



ORF01961 conserved hypothetical protein



ORF01962 metallo-beta-lactamase superfamily protein
ORF01963 conserved hypothetical protein 

ORF01964 glutamine synthetase, type I (glnA)

ORF01965 transcriptional regulator GlnR (glnR)

ORF01967 conserved hypothetical protein

ORF01969 phosphoglycerate kinase (pgk)

ORF01971 glyceraldehyde 3-phosphate dehydrogenase (gap)

ORF01972 translation elongation factor G (fusA)

ORF01973 ribosomal protein S7 (rpsG)

ORF01974 ribosomal protein S12 (rpsL)

ORF01975 pur operon repressor (purR)

ORF01976 HP domain protein

ORF01977 conserved hypothetical protein

ORF01978 conserved hypothetical protein

ORF01979 ribulose-phosphate 3-epimerase (rpe)

ORF01980 conserved hypothetical protein TIGR00157

ORF01983 dimethyladenosine transferase (ksgA)

ORF01985 primase-related protein

ORF01987 deoxyribonuclease, TatD family

ORF01992 dltD protein (dltD)

ORF01993 D-alanyl carrier protein (dltC)

ORF01994 dltB protein (dltB)

ORF01996 D-alanine-activating enzyme (dltA)

ORF01997 sensor histidine kinase

ORF01998 DNA-binding response regulator 

ORF01999 ribosomal protein L34 (rpmH) 

ORF02004 amino acid ABC transporter, ATP-binding protein

ORF02007 conserved hypothetical protein

ORF02008 transcriptional antiterminator, BglG family

ORF02017 sugar binding transcriptional regulator, LacI family

ORF02018 transaldolase family protein

ORF02019 carbohydrate isomerase, AraD/FucA family

ORF02020 hexulose-6-phosphate isomerase, putative

ORF02021 hexulose-6-phosphate synthase, putative

ORF02022 PTS system, IIA component

ORF02023 PTS system, IIB component

ORF02024 transport protein SgaT, putative

ORF02027 adenylosuccinate synthetase (purA)

ORF02033 chaperonin, 33 kDa (hslO)

ORF02034 NifR3/Smm1 family protein

ORF02037 ATP-dependent Clp protease, ATP-binding subunit

ORF02038 transcriptional regulator CtsR (ctsR)

ORF02040 translation elongation factor Ts (tsf)

ORF02041 ribosomal protein S2 (rpsB)

ORF02043 alkyl hydroperoxide reductase, subunit F (ahpF)

ORF02076 prophage LambdaSa2, single-strand binding protein (ssb)

ORF02082 prophage LambdaSa2, type II DNA modification methyltransferase, putative

ORF02086 prophage LambdaSa2, replicative DNA helicase (dnaC)

ORF02104 endopeptidase O (pepO)

ORF02110 polypeptide deformylase (def)

ORF02111 sugar binding transcriptional regulator RegR (regR)

ORF02112 conserved hypothetical protein

ORF02113 PTS system, IID component

ORF02114 PTS system, IIC component

ORF02115 PTS system, IIB component

ORF02116 glucuronyl hydrolase






ORF02118 PTS system, IIA component



ORF02120 oxidoreductase, short-chain dehydrogenase/reductase family



ORF02121 conserved hypothetical protein



ORF02122 carbohydrate kinase, PfkB family

ORF02123 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase (eda)

ORF02127 DNA polymerase III, alpha subunit, Gram-positive type

ORF02129 prolyl-tRNA synthetase (proS)

ORF02130 membrane-associated zinc metalloprotease, putative

ORF02131 phosphatidate cytidylyltransferase (cdsA)



ORF02132 undecaprenyl diphosphate synthase (uppS) 



ORF02133 preprotein translocase, YajC subunit (yajC) 



ORF02140 glucan 1,6-alpha-glucosidase (dexB)

ORF02141 sugar ABC transporter, ATP-binding protein (msmK)
ORF02142 helix-turn-helix domain protein, fts-type



ORF02144 tagatose 1,6-diphosphate aldolase (lacD)
ORF02145 tagatose-6-phosphate kinase (lacC)



ORF02146 galactose-6-phosphate isomerase, LacB subunit (lacB)



ORF02147 galactose-6-phosphate isomerase, LacA subunit (lacA) 



ORF02149 PTS system, IIC component, putative



ORF02150 PTS system, IIB component, putative 



ORF02152 PTS system, IIA component, putative

ORF02153 lactose phosphotransferase system repressor (lacR) 



ORF02157 adhesion lipoprotein



ORF02158 expressed protein of unknown function TIGR00256



ORF02159 GTP pyrophosphokinase (relA)



ORF02161 nrdI protein (nrdI)



ORF02164 iron ABC transporter, iron-binding protein 



ORF02165 DNA-binding response regulator 
ORF02167 PTS system, IID component
ORF02168 PTS system, IIC component



ORF02174 ABC transporter, ATP-binding protein



ORF02176 response regulator 



ORF02177 conserved hypothetical protein



ORF02178 PTS system, IIABC components
ORF02179 sensor histidine kinase



ORF02180 phosphate regulon response regulator PhoB (phoB) 



ORF02182 phosphate ABC transporter, ATP-binding protein (pstB)



ORF02183 phosphate ABC transporter, permease protein



ORF02184 phosphate ABC transporter, permease protein 



ORF02188 conserved hypothetical protein TIGR00046



ORF02189 ribosomal protein L11 methyltransferase (prmA)
ORF02197 conserved hypothetical protein
ORF02199 ATPase, AAA family
ORF02249 mercuric reductase (merA)



ORF02272 DNA topology modulation protein FlaR, putative



ORF02273 glycerol dehydrogenase, putative 



ORF02281 DNA-binding response regulator 



ORF02285 leucyl-tRNA synthetase (leuS)



ORF02290 transcription antitermination protein NusG (nusG)



ORF02293 penicillin-binding protein 2A (pbp2A)

ORF02294 ribosomal large subunit pseudouridine synthase, RluD subfamily
ORF02296 phosphopentomutase (deoB)
ORF02297 deoxyribose-phosphate aldolase (deoC)



ORF02300 uridine phosphorylase (udp) 



ORF02302 60 kDa chaperonin (groEL)
ORF02303 chaperonin, 10 kDa (groES) 



ORF02305 ABC transporter, ATP-binding protein 



ORF02306 ABC transporter, permease protein



ORF02307 expressed putative lipoprotein



ORF02309 glyoxalase family protein



ORF02310 conserved hypothetical protein

ORF02311 anaerobic ribonucleoside-triphosphate reductase activating protein (nrdG)



ORF02312 acetyltransferase, GNAT family

ORF02315 anaerobic ribonucleoside-triphosphate reductase (nrdD)
ORF02318 conserved hypothetical protein



ORF02320 conserved hypothetical protein



ORF02321 conserved hypothetical protein



ORF02322 recA protein (recA)



ORF02325 DNA-3-methyladenine glycosylase I (tag) 



ORF02327 Holliday junction DNA helicase RuvA (ruvA)



ORF02329 DNA mismatch repair protein HexB (hexB)



ORF02333 arginine repressor ArgR, putative
ORF02334 arginyl-tRNA synthetase (argS)



ORF02337 conserved hypothetical protein



ORF02338 conserved hypothetical protein 



ORF02339 aspartyl-tRNA synthetase (aspS) 
ORF02340 histidyl-tRNA synthetase (hisS)



ORF02342 ribosomal protein L33 (rpmG)



ORF02357 DNA-binding response regulator



ORF02359 membrane protein, putative



ORF02360 carbamate kinase (arcC)



ORF02361 ornithine carbamoyltransferase (argF)



ORF02364 amino acid ABC transporter, ATP-binding protein



ORF02365 amino acid ABC transporter, permease and amino acid-binding protein 



ORF02370 membrane protein, putative 



ORF02371 transcriptional regulator, TetR family, putative 



ORF02373 ribosomal protein S4 (rpsD)



ORF02374 conserved hypothetical protein 



ORF02375 replicative DNA helicase (dnaC)



ORF02376 ribosomal protein L9 (rplI)
ORF02377 DHH family protein



ORF02378 glucose inhibited division protein A (gidA)

ORF02380 tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase (trmU)

ORF02381 L-serine dehydratase, iron-sulfur-dependent, beta subunit (sdhB)

ORF02382 L-serine dehydratase, iron-sulfur-dependent, alpha subunit (sdhA)

ORF02385 cobalt transport family protein

ORF02386 ABC transporter, ATP-binding protein

ORF02387 ABC transporter, ATP-binding protein, FRAMESHIFT

ORF02388 CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase (pgsA)
ORF02389 peptidase, M16 family



ORF02390 conserved hypothetical protein



ORF02391 conserved hypothetical protein



ORF02392 recF protein (recF)



ORF02396 inosine-5'-monophosphate dehydrogenase (guaB)



ORF02397 transcriptional regulator, ArgR family
ORF02400 arginine deiminase (arcA)



ORF02402 ornithine carbamoyltransferase (argF)



ORF02404 carbamate kinase (arcC)
ORF02405 tryptophanyl-tRNA synthetase (trpS)
ORF02407 conserved hypothetical protein
ORF02408 ABC transporter, ATP-binding protein

ORF02409 ABC transporter, permease protein, putative

ORF02410 conserved hypothetical protein TIGR00246

ORF02411 serine protease

ORF02412 partitioning protein, ParB family

ORF02413 chromosomal replication initiator protein DnaA (dnaA)

ORF02415 DNA polymerase III, beta subunit (dnaN)

ORF02417 conserved hypothetical protein

ORF02419 conserved hypothetical GTP-binding protein

ORF02420 peptidyl-tRNA hydrolase (pth)

ORF02421 transcription-repair coupling factor (mfd)

ORF02423 S4 domain protein

ORF02424 cell division protein DivIC, putative

ORF02426 expressed protein of unknown function

ORF02427 MesJ/Ycf62 family protein

ORF02429 cell division protein FtsH (ftsH)
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IrMacrirtfK t ^u^^^i^rihrtcviamjnnimldayolecarhnyamide formvltransf erase/1 MK cvclonydrolase (purn) | 


lAhrAflnnc A/>»n>>\n/aH l"ii/nr\thOTlf*9l t^irrtifSII'^ 1 






ORF00044 threonine synthase (thrC)


ORF00081 ribosomal protein L17 (rplQ)


ORF00090 conserved hypothetical protein


ORF00129 argininosuccinate synthase (argG)


ORF00156 oligopeptide ABC transporter, substrate-binding protein, putative


ORF00189 protease, putative


ORF00194 thioredoxin family protein


ORF00195 tRNA binding domain protein


ORF00217 conserved domain protein


ORF00218 PTS system, IIB component, putative


ORF00220 transketolase, N-terminal subunit


ORF00221 transketolase, C-terminal subunit


ORF00223 oxidoreductase, putative


ORF00282 acetyltransferase, GNAT family


ORF00290 IS1381, transposase OrfB


ORF00291 IS1381, transposase OrfA




ORF00293 conserved hypothetical protein




ORF00301 membrane protein, putative




ORF00343 ABC transporter, permease protein, putative




ORF00344 conserved hypothetical protein




ORF00382 aspartate kinase family protein




ORF00399 conserved hypothetical protein




ORF00439 cell wall surface anchor family protein




ORF00447 cytidine/deoxycytidylate deaminase family protein




ORF00450 5-formyltetrahydrofolate cyclo-ligase family protein




ORF00480 transcriptional regulator, MerR family




ORF00499 acetyltransferase, GNAT family




ORFOOSM magnesium transporter, CorA family




ORF00521 VanZF domain protein




ORF00612 IS1381, transposase OrfA




ORF00613 IS1381, transposase OrfB




ORF00690 transmembrane protein Vexp1 (vex1)




ORF00691 ABC transporter, ATP-binding protein Vexp2 (vex2)


ORF00692 transmembrane protein Vexp3 (vex3)




ORF00714 conserved hypothetical protein




ORF00732 expressed cell wall surface anchor family protein, putative




ORF00774 ABC transporter, ATP-binding protein




ORF00778 ABC transporter, ATP-binding protein




ORF00780 conserved hypothetical protein




ORF00790 beta-glucuronidase




ORF00800 alpha amylase family protein




ORF00807 amino acid ABC transporter, permease protein




ORF00809 amino acid ABC transporter, amino acid-binding protein




ORF00814 conserved hypothetical protein




ORF00823 bacterial luciferase family protein




ORF00840 riboflavin biosynthesis protein RibD (ribD)




ORF00841 riboflavin synthase, alpha subunit (ribE)




ORF00842 riboflavin biosynthesis protein RibA (ribA)




ORF00843 riboflavin synthase, beta subunit (ribH)




ORF00866 penicillin-binding protein^b




ORF00905 membrane protein, putative






Table 9: GBS genes shared with pneumococcus



ORFxxxxx Annotation

ORF00910 major facilitator family protein



ORF00913 hydrolase, haloacid dehalogenase-like family
ORF00918 conserved hypothetical protein



ORF00945 conserved hypothetical protein



ORF00948 ABC transporter, ATP-binding protein



ORF00952 phosphomethylpyrimidine kinase (thiD)



ORF00953 hydroxyethylthiazole kinase (thiM)



ORF00954 thiamine-phosphate pyrophosphorylase (thiE)



ORF00961 GtrA family protein

ORF00967 1,4-alpha-glucan branching enzyme (glgB)
ORF00968 glucose-1-phosphate adenylyltransferase (glgC)
ORF00971 glycogen synthase (glgA)



ORF00985 acetyltransferase, GNAT family



ORF00990 magnesium transporter, CorA family, putative
ORF01022 nucleoside diphosphate kinase (ndk)
ORF01031 nucleoside diphosphate kinase domain protein



ORF01085 conserved hypothetical protein 
ORF01087 IS1381, transposase OrfA



ORF01088 IS1381, transposase OrfB



ORF01098 ABC transporter, permease protein, putative 



ORF01100 sensor histidine kinase



ORF01102 ABC transporter, substrate-binding protein



ORF01127 protease, putative

ORF01135 iron compound ABC transporter, permease protein



ORF01136 iron compound ABC transporter, permease protein



ORF01185 aspartate-semialdehyde dehydrogenase (asd) 



ORF01217 conserved hypothetical protein



ORF01218 conserved hypothetical protein

ORF01219 formate/nitrite transporter family protein

ORF01226 oxidoreductase, short chain dehydrogenase/reductase family, FRAMESHIFT

ORF01254 homoserine kinase (thrB)

ORF01255 homoserine dehydrogenase (hom)

ORF01264 transcriptional regulator, Cro/CI family

ORF01268 thiol peroxidase (psaD)

ORF01305 glycosyltransferase CpsJ(V) (cpsJ) 



ORF01306 glycosyltransferase CpsO(V) (cpsO)
ORF01313 CpsD protein (cpsD)
ORF01314 cpsC protein (cpsC)



ORF01315 capsular polysaccharide biosynthesis protein CpsB (cpsB)



ORF01316 capsular polysaccharide biosynthesis protein CpsA (cpsA) 



ORF01326 conserved hypothetical protein



ORF01333 alpha-acetolactate decarboxylase (budA) 



ORF01334 acetolactate synthase, catabolic (ilvK)



ORF01337 MutT/nudix family protein
ORF01369 MATE efflux family protein
ORF01398 Tn5252, Orf 9 protein



ORF01399 Tn5252, Orf 10 protein
ORF01446 protease, putative



ORF01447 conserved hypothetical protein



ORF01449 conserved hypothetical protein 



ORF01492 NADP-specific glutamate dehydrogenase (gdhA)



ORF01569 expressed cell wall surface anchor family protein



ORF01570 cell wall surface anchor family protein
ORF01574 polysaccharide biosynthesis protein



ORF01579 nucleotidyl transferase, putative
ORF01580 polysaccharide biosynthesis protein, putative
ORF01612 conserved hypothetical protein
ORF01613 glycosyl transferase, group 1 family protein
ORF01617 conserved hypothetical protein



ORF01618 conserved hypothetical protein 



ORF01621 glycosyl transferase, putative



ORF01622 glycosyl transferase, group 2 family protein



ORF01623 glycosyl transferase, family 8, degenerate



ORF01624 IS1381, transposase OrfB



ORF01625 IS1381, transposase OrfA



ORF01626 glycosyl transferase, family 8



ORF01627 glycosyl transferase, family 8



ORF01628 conserved hypothetical protein 



ORF01630 cell wall surface anchor family protein
ORF01635 protease, putative



ORF01643 aminopeptidase PepS (pepS)



ORF01702 peptidase, M20/M25/M40 family



ORF01731 IS1381, transposase OrfA



ORF01732 IS1381, transposase OrfB
ORF01740 tellurite resistance protein TehB (tehB) 



ORF01747 methylated-DNA-protein-cysteine S-methyltransferase (ogt)



ORF01749 acetyltransferase, GNAT family

ORF01763 AcuB family protein

ORF01764 branched-chain amino acid ABC transporter, ATP-binding protein (livF)



ORF01765 branched-chain amino acid ABC transporter, ATP-binding protein (livG)
ORF01766 branched-chain amino acid ABC transporter, permease protein



ORF01767 branched-chain amino acid ABC transporter, permease protein (livH)



ORF01769 branched-chain amino acid ABC transporter, amino acid-binding protein
ORF01775 aminotransferase, class I



ORF01779 potassium uptake protein, Trk family



ORF01780 cation uptake protein, Trk family 



ORF01824 cobalt transport family protein



ORF01826 conserved hypothetical protein 



ORF01832 peptidase, M20/M25/M40 family
ORF01845 conserved hypothetical protein
ORF01848 transcriptional regulator, MerR family
ORF01853 isochorismatase family protein



ORF01859 membrane protein



ORF01875 oxidoreductase, aldo/keto reductase family



ORF01880 phospho-2-dehydro-3-deoxyheptonate aldolase
ORF01981 rRNA (guanine-N1-)-methyltransferase, putative



ORF02083 prophage LambdaSa2, DNA replication protein DnaC, putative



ORF02101 Na+/H+ exchanger family protein
ORF02107 membrane protein, putative
ORF02139 UDP-glucose 4-epimerase (galE)
ORF02143 lacX protein



ORF02162 conserved hypothetical protein 



ORF02186 hemolysin precursor, putative



ORF02192 transcriptional regulator, MerR family



ORF02195 MutT/nudix family protein 



ORF02228 IS1381, transposase OrfB
ORF02229 IS1381, transposase OrfA



ORF02233 conserved hypothetical protein



ORF02234 conserved hypothetical protein

ORF02276 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase (metE)
ORF02278 branched-chain amino acid transport protein AzlC, putative



ORF02288 glycosyl transferase, family 8



ORF02289 glycosyl transferase, family 8



ORF02341 ribosomal protein L32 (rpmF) 



ORF02343 conserved hypothetical protein



ORF02358 sensor histidine kinase 



ORF02369 conserved hypothetical protein 



ORF02384 LysM domain protein
ORF02428 hypoxanthine-guanine phosphoribosyltransferase (hpt)



ORF03011 ribosomal protein L33 



ORF03014 ribosomal protein L33 
ORF00064 ribosomal protein S14, putative

ORF00095 D-alanyl-D-alanine carboxypeptidase family protein

ORF00096 N-acetylmuramoyl-L-alanine amidase, family 4 protein

ORF00110 conserved hypothetical protein

ORF00112 DNA repair protein RadA (radA)

ORF00124 permease, putative

ORF00148 glycosyl transferase, group 4 family protein

ORF00154 penicillin-binding protein 4, putative

ORF00157 oligopeptide ABC transporter, permease protein

ORF00206 oligopeptide ABC transporter, oligopeptide-binding protein

ORF00207 oligopeptide ABC transporter, permease protein

ORF00208 oligopeptide ABC transporter, permease protein

ORF00209 peptide ABC transporter, ATP-binding protein

ORF00210 peptide ABC transporter, ATP-binding protein

ORF00216 IS1548, transposase

ORF00226 conserved hypothetical protein

ORF00232 conserved hypothetical protein

ORF00239 site-specific recombinase, phage integrase family

ORF00250 conserved hypothetical protein

ORF00251 conserved hypothetical protein

ORF00289 ABC transporter, ATP-binding protein

ORF00305 NADH oxidase, putative

ORF00317 cell division protein FtsL, putative

ORF00333 conserved hypothetical protein
ORF00383 hydrolase, haloacid dehalogenase-like family
ORF00430 expressed putative lipoprotein
ORF00431 transcriptional repressor CopY
ORF00434 membrane protein, putative
ORF00438 transcriptional regulator, Fur family
ORF00442 membrane protein, putative
ORF00445 bioY family protein
ORF00446 AtsA/EtaC family protein
ORF00468 expressed putative protease
ORF00469 glycosyl transferase, group 2 family protein
ORF00471 nrdI protein (nrdI)
ORF00473 expressed protein of unknown function
ORF00474 conserved hypothetical protein
ORF00507 conserved hypothetical protein
ORF00525 bioY family protein
ORF00528 thiolase
ORF00531 AMP-binding enzyme domain protein
ORF00548 YGGT family protein
ORF00565 exodeoxyribonuclease VII, small subunit (xseB)
ORF00568 arginine repressor ArgR, putative
ORF00572 expressed putative lipase/acylhydrolase
ORF00573 conserved hypothetical protein
ORF00586 iron-sulfur cluster-binding protein, putative
ORF00592 oxidoreductase, short chain dehydrogenase/reductase family
ORF00804 dipeptidase
ORF00611 voltage-gated chloride channel family protein
ORF00619 prophage LambdaSa1, repressor protein, putative
ORF00622 conserved hypothetical protein
ORF00627 prophage LambdaSa1, antirepressor, putative
ORF00634 conserved hypothetical protein
ORF00648 conserved hypothetical protein
ORF00654 conserved hypothetical protein



ORF00655 conserved hypothetical protein



ORF00656 conserved hypothetical protein



ORF00658 conserved hypothetical protein 



ORF00659 conserved hypothetical protein 



ORF00660 prophage LambdaSa1, structural protein, putative



ORF00662 conserved hypothetical protein 
ORF00663 conserved hypothetical protein 



ORF00664 conserved hypothetical protein



ORF00665 conserved hypothetical protein



ORF00666 prophage LambdaSa1, structural protein

ORF00668 conserved hypothetical protein
ORF00669 prophage LambdaSa1, pblA protein, internal deletion
ORF00677 prophage LambdaSa1, lysin, putative



ORF00679 conserved hypothetical protein



ORF00695 transposase OrfB, IS3 family, truncation
ORF00697 conserved hypothetical protein



ORF00707 conserved domain protein



ORF00713 acid phosphatase precursor, class B
ORF00720 transposase OrfB, IS3 family FRAMESHIFT
ORF00721 transposase OrfA, IS3 family
ORF00751 cylA protein (cylA)



ORF00755 cylI protein (cylI)
ORF00760 serine protease, subtilase family, putative, POINT MUTATION
ORF00781 transcriptional regulator, LysR family



ORF00783 regulatory protein, putative 



ORF00785 IS1548, transposase



ORF00786 regulatory protein, putative, truncation
ORF00787 D-lactate dehydrogenase (ldhA)



ORF00801 glycosyl transferase, group 1 family protein
ORF00805 conserved hypothetical protein



ORF00826 phage shock protein C, putative

ORF00833 conserved hypothetical protein 
ORF00845 hydrolase, haloacid dehalogenase-like family
ORF00852 conserved hypothetical protein



ORF00853 expressed putative lipoprotein



ORF00857 IS1548, transposase



ORF00890 conserved hypothetical protein
ORF00902 conserved hypothetical protein



ORF00926 membrane protein, putative



ORF00927 membrane protein, putative



ORF00987 conserved hypothetical protein



ORF01009 expressed protein of unknown function 



ORF01010 lipoyl-binding domain protein 



ORF01011 oxidoreductase, putative
ORF01012 conserved hypothetical protein
ORF01024 expressed putative lipoprotein



ORF01061 signal peptidase I, putative



ORF01064 IS1548, transposase 
ORF01084 glyoxalase family protein
ORF01104 SatD



ORF01126 conserved hypothetical protein 



ORF01191 conserved hypothetical protein



ORF01192 conserved hypothetical protein

ORF01193 glycine cleavage system H protein, putative
ORF01194 bacterial luciferase family protein



ORF01195 oxidoreductase, FMN-binding



ORF01197 lipoate-protein ligase A family protein
ORF01202 IS861, transposase OrfA



ORF01223 drug resistance transporter, EmrB/QacA family, putative 



ORF01224 conserved hypothetical protein



ORF01225 potassium uptake protein, putative
ORF01237 membrane protein, putative



ORF01249 dihydroneopterin aldolase (folB) 

ORF01256 polysaccharide deacetylase family protein 



ORF01273 transcriptional regulator, GntR family/potassium uptake protein, TrkA family



ORF01280 conserved hypothetical protein



ORF01281 conserved hypothetical protein 



ORF01289 lipoprotein, putative
ORF01291 conserved hypothetical protein
ORF01298 conserved hypothetical protein



ORF01318 conserved hypothetical protein 



ORF01320 voltage-gated chloride channel family protein, putative 



ORF01322 arsenate reductase (arsC)



ORF01340 dTDP-glucose 4,6-dehydratase (rfbB)



ORF01341 dTDP-4-dehydrorhamnose 3,5-epimerase 



ORF01342 glucose-1-phosphate thymidylyltransferase (rfbA)



ORF01356 hypothetical protein



ORF01368 conserved hypothetical protein 



ORF01374 ISSdy1, transposase OrfB



ORF01388 transposase OrfA, IS3 family



ORF01389 transposase OrfB, IS3 family, truncation 



ORF01391 ISSdy1, transposase OrfB FRAMESHIFT



ORF01396 transcriptional regulator, Cro/CI family



ORF01419 repressor protein, putative



ORF01461 amino acid permease



ORF01469 conserved hypothetical protein 



ORF01483 sensor histidine kinase



ORF01485 GTP pyrophosphokinase family protein



ORF01490 5'-nucleotidase family protein
ORF01509 2-dehydropantoate 2-reductase, putative 



ORF01510 regulatory protein, putative



ORF01522 carbamoyl-phosphate synthase, large subunit, putative



ORF01542 sulfatase



ORF01549 conserved hypothetical protein 



ORF01554 iron compound ABC transporter, substrate-binding protein 



ORF01557 conserved hypothetical protein



ORF01563 conserved hypothetical protein TIGR01212 
ORF01583 glycosyltransferase, group 2 family protein



ORF01584 glycosyltransferase, group 2 family protein 



ORF01585 glycosyltransferase, putative 



ORF01586 dTDP-4-dehydrorhamnose reductase (rfbD)



ORF01593 conserved hypothetical protein



ORF01599 conserved hypothetical protein



ORF01600 glycerol-3-phosphate transporter, putative



ORF01639 conserved hypothetical protein



ORF01650 nitroreductase family protein



ORF01653 amino acid permease 



ORF01665 transcriptional regulator, MutR family



ORF01683 MutT/nudix family protein
ORF01686 67 kDa Myosin-crossreactive streptococcal antigen,
ORF01688 peptide methionine sulfoxide reductase (msrA)
ORF01694 peptide ABC transporter, permease protein



ORF01704 conserved hypothetical protein
ORF01705 IS861, transposase OrfA



ORF01741 membrane protein, putative



ORF01770 conserved hypothetical protein 
ORF01772 IS1548, transposase



ORF01790 conserved hypothetical protein



ORF01794 conserved hypothetical protein
ORF01800 amino acid ABC transporter, substrate-binding protein
ORF01810 IS1548, transposase



ORF01827 sodium:dicarboxylate symporter family protein
ORF01877 immunogenic secreted protein, putative
ORF01913 transcriptional regulator, Cro/CI family
ORF01928 membrane protein, putative



ORF01931 transporter, putative
ORF01932 transcriptional regulator, Crp/Fnr family



ORF01947 transcriptional regulator, merR family



ORF01970 acid phosphatase



ORF02002 amino acid ABC transporter, permease protein



ORF02028 perfringolysin O regulator protein (pfoR)



ORF02029 conserved hypothetical protein



ORF02031 expressed protein of unknown function



ORF02032 expressed protein of unknown function



ORF02035 deoxynucleoside kinase family protein



ORF02042 alkyl hydroperoxide reductase, subunit C (ahpC)



ORF02126 transcriptional regulator, MarR family
ORF02128 N-acetylmuramoyl-L-alanine amidase, family 4 protein
ORF02135 malate oxidoreductase



ORF02136 citrate carrier protein, CCS family 



ORF02137 sensor histidine kinase family protein 



ORF02138 response regulator



ORF02166 conserved hypothetical protein 



ORF02169 PTS system, IIB component



ORF02170 PTS system, IIA component, putative



ORF02202 ABC transporter, ATP-binding protein



ORF02262 ABC transporter, ATP-binding protein



ORF02270 CAMP factor (cfb)



ORF02280 serine protease, subtilase family, putative



ORF02286 major facilitator family protein 



ORF02292 preprotein translocase, SecE subunit, putative
ORF02295 Lyme disease proteins of unknown function, putative



ORF02298 Na+ dependent nucleoside transporter 



ORF02301 transcriptional regulator, GntR family



ORF02313 virulence factor MviM, putative 



ORF02316 membrane protein, putative



ORF02319 conserved hypothetical protein TIGR00250



ORF02328 transporter, putative
ORF02331 cold shock protein, CSD family



ORF02332 DNA mismatch repair protein HexA (hexA) 



ORF02335 conserved hypothetical protein



ORF02372 conserved hypothetical protein 



ORF02383 expressed putative lipoprotein 



ORF02393 transporter, putative
ORF02398 transcriptional regulator, Crp/Fnr family

ORF02399 conserved hypothetical protein 

ORF02401 acetyltransferase, GNAT family

ORF02403 arginine/ornithine antiporter (arcD)

ORF03002 conserved hypothetical protein, truncation
ORF00008 protease, putative



ORF00010 acyl carrier protein (acpP)

ORF00016 acetyltransferase, GNAT family

ORF00018 peptidase, M23/M37 family, putative secreted protein



ORF00035 membrane protein, putative
ORF00087 lipoprotein, putative



ORF00088 hypothetical protein



ORF00089 hypothetical protein 



ORF00091 conserved hypothetical protein 



ORF00117 ribose ABC transporter, periplasmic D-ribose-binding protein (rbsB)
ORF00118 ribose ABC transporter, permease protein (rbsC)



ORF00120 ribose ABC transporter protein RbsD (rbsD)



ORF00121 ribokinase (rbsK)

ORF00123 hypothetical protein

ORF00130 argininosuccinate lyase (argH)



ORF00137 conserved hypothetical protein

ORF00138 hypothetical protein
ORF00166 4-diphosphocytidyl-2C-methyl-D-erythritol kinase (ispE)

ORF00182 conserved domain protein

ORF00186 transcriptional regulator, Cro/CI family



ORF00187 hypothetical protein 



ORF00188 hypothetical protein 



ORF00192 hypothetical protein

ORF00193 conserved hypothetical protein
ORF00196 conserved hypothetical protein
ORF00199 hydrolase, haloacid dehalogenase-like family 



ORF00200 sensor histidine kinase, putative
ORF00201 response regulator



ORF00203 conserved hypothetical protein



ORF00204 membrane protein, putative



ORF00205 hypothetical protein
ORF00228 lipoprotein, putative
ORF00234 hypothetical protein
ORF00235 hypothetical protein



ORF00238 hypothetical protein

ORF00240 transcriptional regulator, Cro/CI family
ORF00241 hypothetical protein



ORF00242 conserved hypothetical protein



ORF00243 hypothetical protein



ORF00244 conserved domain protein 



ORF00245 conserved hypothetical protein, fusion 



ORF00246 replication initiation protein, putative
ORF00247 hypothetical protein



ORF00248 recombination protein 



ORF00249 hypothetical protein



ORF00252 conserved hypothetical protein



ORF00253 hypothetical protein



ORF00254 hypothetical protein



ORF00255 hypothetical protein



ORF00256 hypothetical protein 



ORF00257 hypothetical protein 



ORF00258 hypothetical protein



ORF00259 hypothetical protein



ORF00260 hypothetical protein 



ORF00272 expressed putative lipoprotein



Table 11: GBS genes not shared with GAS or pneumococcus

ORFxxxxx Annotation



ORF00273 hypothetical protein 



ORF00274 hypothetical protein 



ORF00275 hypothetical protein 



ORF00276 hypothetical protein



ORF00278 membrane protein, putative



ORF00279 transcriptional regulator, Cro/CI family 



ORF00280 acetyltransferase, GNAT family
ORF00281 acetyltransferase, GNAT family
ORF00283 conserved hypothetical protein



ORF00284 RNA polymerase sigma factor, ECF subfamily 



ORF00285 lipoprotein, putative



ORF00287 transcriptional regulator, TetR family

ORF00288 ABC transporter efflux protein, DrrB family, putative 



ORF00292 hypothetical protein 



ORF00294 expressed protein of unknown function 



ORF00298 acyl carrier protein phosphodiesterase, putative



ORF00308 conserved hypothetical protein



ORF00324 conserved hypothetical protein 



ORF00332 hypothetical protein



ORF00340 hypothetical protein



ORF00347 conserved hypothetical protein 
ORF00384 hypothetical protein 
ORF00402 membrane protein, putative



ORF00408 hypothetical protein



ORF00409 membrane protein, putative



ORF00414 conserved hypothetical protein



ORF00416 hypothetical protein



ORF00417 hypothetical protein 



ORF00433 copper-transporter protein CopZ
ORF00448 hypothetical protein



ORF00466 conserved hypothetical protein 



ORF00467 acetyltransferase, GNAT family



ORF00475 conserved domain protein 

ORF00476 hypothetical protein

ORF00478 carboxymuconolactone decarboxylase family protein



ORF00479 conserved hypothetical protein



ORF00486 transcriptional regulator, AraC family



ORF00487 surface protein Rib 
ORF00488 transposase, IS256 family, truncation



ORF00489 DNA-damage-inducible protein J, putative



ORF00490 hypothetical protein
ORF00491 lipoprotein, putative



ORF00493 bacteriophage L54a, integrase, truncation 
ORF00497 conserved domain protein



ORF00503 oxidoreductase, Gfo/Idh/MocA family



ORF00506 transposase, IS256 family



ORF00510 bacteriocin transport accessory protein, putative
ORF00512 hypothetical protein



ORF00526 biotin synthetase (bioB)
ORF00527 hypothetical protein



ORF00533 type IV prepilin peptidase-related protein



ORF00538 conserved hypothetical protein
ORF00555 hypothetical protein



ORF00563 expressed protein of unknown function 
ORF00575 hypothetical protein
ORF00584 conserved hypothetical protein



ORF00585 fructose-1,6-bisphosphatase, putative



ORF00590 carboxymethylenebutenolidase-related protein



ORF00597 conserved hypothetical protein



ORF00598 inosine-uridine preferring nucleoside hydrolase 
ORF00599 hypothetical protein



ORF00600 OsmC/Ohr family protein

ORF00608 adenosine deaminase, putative

ORF00610 chorismate mutase, putative

ORF00615 prophage LambdaSa1, site-specific recombinase, phage integrase family
ORF00617 conserved domain protein

ORF00618 hypothetical protein

ORF00620 hypothetical protein

ORF00621 conserved hypothetical protein

ORF00623 hypothetical protein

ORF00624 hypothetical protein

ORF00626 prophage LambdaSa1, transcriptional regulator, Cro/CI family

ORF00628 hypothetical protein

ORF00630 hypothetical protein

ORF00632 hypothetical protein

ORF00633 conserved hypothetical protein

ORF00635 hypothetical protein

ORF00636 hypothetical protein



ORF00637 hypothetical protein



ORF00638 conserved hypothetical protein 



ORF00639 conserved domain protein

ORF00641 prophage LambdaSa1, reverse transcriptase/maturase family protein



ORF00642 conserved hypothetical protein



ORF00643 conserved hypothetical protein 
ORF00644 hypothetical protein



ORF00645 hypothetical protein



ORF00646 conserved hypothetical protein 



ORF00647 hypothetical protein



ORF00649 hypothetical protein



ORF00650 hypothetical protein
ORF00652 conserved hypothetical protein 



ORF00653 conserved hypothetical protein



ORF00657 conserved hypothetical protein, truncation 



ORF00661 conserved hypothetical protein



ORF00667 conserved hypothetical protein

ORF00670 prophage LambdaSa1, minor structural protein, putative



ORF00671 prophage LambdaSa1, N-acetylmuramoyl-L-alanine amidase, family 4

ORF00672 prophage LambdaSa1, minor structural protein, putative

ORF00673 hypothetical protein

ORF00674 hypothetical protein

ORF00675 conserved hypothetical protein

ORF00676 conserved hypothetical protein

ORF00678 conserved hypothetical protein

ORF00681 conserved hypothetical protein

ORF00682 hypothetical protein

ORF00683 prophage LambdaSa1, site-specific recombinase, phage integrase family FRAMESHIFT

ORF00685 conserved hypothetical protein

ORF00689 conserved hypothetical protein, FRAMESHIFT



ORF00698 hypothetical protein



ORF00703 phosphoserine phosphatase SerB (serB) 



ORF00704 MutT/nudix family protein



ORF00712 hypothetical protein 



ORF00718 cell wall surface protein, interruption-N



ORF00723 hypothetical protein



ORF00726 transcriptional regulator, AraC family 



ORF00727 expressed cell wall surface anchor family protein



ORF00728 expressed cell wall surface anchor family protein
ORF00735 expressed protein of unknown function
ORF00737 conserved hypothetical protein, degenerate
ORF00738 hypothetical protein 



ORF00740 hypothetical protein



ORF00741 hypothetical protein



ORF00742 lipoprotein, putative 



ORF00747 cylD protein (cylD)



ORF00749 acyl carrier protein AcpC 



ORF00750 cylZ protein FRAMESHIFT



ORF00752 cylB protein (cylB)



ORF00753 cylE protein (cylE) 



ORF00754 cylF protein (cylF)



ORF00756 cylJ protein (cylJ)



ORF00757 cylK protein (cylK)
ORF00758 hypothetical protein



ORF00759 putative secreted protein 
ORF00761 hypothetical protein 



ORF00766 expressed putative secreted protein



ORF00767 hypothetical protein



ORF00768 conserved domain protein 



ORF00769 permease, putative



ORF00775 conserved hypothetical protein



ORF00777 DedA family protein, putative



ORF00779 membrane protein, putative



ORF00788 sodium:galactoside symporter family protein, putative 
ORF00791 transcriptional regulator, GntR family



ORF00793 glucuronate isomerase (uxaC)



ORF00794 mannonate dehydratase (uxuA) 



ORF00795 D-mannonate oxidoreductase



ORF00796 hydrolase, haloacid dehalogenase-like family 



ORF00797 glycosyl hydrolase, family 1



ORF00806 conserved hypothetical protein 



ORF00822 ABC transporter, ATP-binding protein
ORF00827 hypothetical protein



ORF00834 conserved hypothetical protein 



ORF00838 membrane protein, putative



ORF00839 Mn2+/Fe2+ transporter, NRAMP family



ORF00848 conserved domain protein



ORF00872 cell wall surface anchor family protein



ORF00874 conserved hypothetical protein



ORF00878 ABC transporter, permease protein 



ORF00879 YaeC family protein, putative 
ORF00888 hydrolase, haloacid dehalogenase-like family 



ORF00891 conserved domain protein



ORF00898 conserved hypothetical protein 



ORF00900 permease, GntP family



ORF00903 transcriptional regulator, MarR family



ORF00907 glutathione S-transferase family protein
ORF00909 hypothetical protein



ORF00921 membrane protein, putative 



ORF00922 glycosyl transferase, family 8



ORF00923 hypothetical protein



ORF00924 conserved hypothetical protein 



ORF00939 conserved hypothetical protein 



ORF00942 expressed putative secreted protein 



ORF00943 hypothetical protein



ORF00944 hypothetical protein



ORF00946 conserved hypothetical protein
ORF00950 hypothetical protein
ORF00951 transcriptional regulator, TenA family 



ORF00972 ATP synthase F0, C subunit (atpE)
ORF00980 conserved hypothetical protein



ORF00982 conserved hypothetical protein



ORF01003 conserved hypothetical protein
ORF01004 conserved hypothetical protein



ORF01013 hypothetical protein
ORF01014 hypothetical protein



ORF01015 hypothetical protein



ORF01016 hypothetical protein



ORF01018 hypothetical protein 



ORF01019 hypothetical protein 



ORF01021 hypothetical protein
ORF01025 HP domain protein



ORF01026 acetyltransferase, GNAT family



ORF01032 chloramphenicol acetyltransferase (cat)



ORF01034 Tn916, transposase



ORF01035 Tn916, excisionase



ORF01037 Tn916, hypothetical protein



ORF01038 Tn916, hypothetical protein



ORF01039 Tn916, transcriptional regulator, putative



ORF01041 Tn916, hypothetical protein



ORF01042 Tn916, NLP/P60 family protein



ORF01044 membrane protein, putative FRAMESHIFT



ORF01048 Tn916, hypothetical protein



ORF01049 Tn916, hypothetical protein 



ORF01050 Tn916, hypothetical protein

ORF01051 Tn916, transcriptional regulator, putative
ORF01052 Tn916, FtsK/SpoIIIE family protein



ORF01053 Tn916, hypothetical protein
ORF01054 Tn916, hypothetical protein
ORF01062 hypothetical protein



ORF01086 Na+/H+ exchanger family protein



ORF01092 acetyltransferase, GNAT family



ORF01096 nisin-resistance protein, putative



ORF01103 conserved hypothetical protein



ORF01124 acetyltransferase, GNAT family



ORF01133 iron-compound ABC transporter, iron-compound-binding protein
ORF01140 conserved hypothetical protein
ORF01142 carbon starvation protein CstA, putative
ORF01143 response regulator



ORF01144 sensor histidine kinase, putative 



ORF01145 lipoprotein, putative



ORF01146 conserved hypothetical protein, FRAMESHIFT
ORF01148 lipoprotein, putative

ORF01149 hypothetical protein

ORF01150 hypothetical protein

ORF01151 hypothetical protein

ORF01152 lipoprotein, putative

ORF01153 hypothetical protein

ORF01157 conserved hypothetical protein

ORF01158 hypothetical protein
ORF01159 hypothetical protein

ORF01160 expressed protein of unknown function FRAMESHIFT
ORF01161 expressed conserved domain protein
ORF01162 conserved hypothetical protein
ORF01164 FtsK/SpoIIIE family protein FRAMESHIFT
ORF01166 hypothetical protein
ORF01167 conserved hypothetical protein
ORF01168 conserved hypothetical protein
ORF01169 hypothetical protein
ORF01172 phage infection protein, putative
ORF01173 conserved hypothetical protein
ORF01174 conserved domain protein
ORF01175 hypothetical protein
ORF01182 membrane protein, putative
ORF01186 cell wall surface anchor family protein, putative
ORF01187 hypothetical protein
ORF01204 hypothetical protein
ORF01215 hypothetical protein

ORF01241 transcriptional regulator, AraC family, putative

ORF01253 rarD protein (rarD)

ORF01257 transporter, BCCT family protein

ORF01258 hypothetical protein

ORF01261 expressed protein of unknown function

ORF01262 conserved hypothetical protein, FRAMESHIFT
ORF01263 hypothetical protein

ORF01265 hypothetical protein

ORF01266 hypothetical protein

ORF01269 conserved hypothetical protein

ORF01272 conserved hypothetical protein

ORF01277 conserved hypothetical protein
ORF01287 conserved hypothetical protein
ORF01288 membrane protein, putative

ORF01299 CMP-N-acetylneuraminic acid synthetase NeuA (neuA)
ORF01300 neuD protein (neuD)

ORF01301 UDP-N-acetylglucosamine-2-epimerase NeuC (neuC)
ORF01302 N-acetyl neuramic acid synthetase NeuB (neuB)
ORF01303 polysaccharide biosynthesis protein CpsL (cpsL)
ORF01304 polysaccharide biosynthesis protein CpsK(V) (cpsK)

ORF01307 glycosyltransferase CpsN(V) (cpsN)
ORF01308 polysaccharide biosynthesis protein CpsM(V) (cpsM)
ORF01309 polysaccharide biosynthesis protein CpsH(V) (cpsH)
ORF01310 glycosyltransferase CpsG(V) (cpsG)

ORF01311 polysaccharide biosynthesis protein CpsF (cpsF)
ORF01312 glycosyltransferase CpsE (cpsE)
ORF01348 conserved domain protein
ORF01349 hypothetical protein
ORF01370 conserved hypothetical protein
ORF01371 conserved hypothetical protein

ORF01372 expressed protein of unknown function
ORF01373 ISSdy1, transposase OrfA



ORF01375 conserved hypothetical protein 



ORF01379 transposase OrfB, IS3 family, truncation



ORF01382 GBSi1, group II intron, maturase



ORF01384 hypothetical protein



ORF01385 hypothetical protein



ORF01386 conserved hypothetical protein



ORF01387 conserved hypothetical protein, truncation
ORF01390 ISSdy1, transposase OrfA FRAMESHIFT
ORF01392 hypothetical protein 



ORF01393 hypothetical protein



ORF01394 site-specific recombinase, phage integrase family
ORF01395 conserved hypothetical protein



ORF01401 transposase, ISL3 family



ORF01404 mercuric resistance operon regulatory protein MerR (merR) 



ORF01408 cadmium efflux system accessory protein (CadC)



ORF01409 conserved hypothetical protein



ORF01410 hypothetical protein



ORF01417 hypothetical protein



ORF01418 hypothetical protein



ORF01420 hypothetical protein 



ORF01421 ImpB/MucB/SamB family protein



ORF01423 conserved hypothetical protein 



ORF01424 conserved hypothetical protein 



ORF01425 conserved hypothetical protein



ORF01426 conserved hypothetical protein
ORF01427 hypothetical protein 
ORF01428 conserved hypothetical protein 



ORF01430 hypothetical protein 
ORF01431 hypothetical protein 



ORF01432 conserved domain protein 



ORF01433 SNF2 family protein 
ORF01434 hypothetical protein



ORF01435 calcium-binding protein, putative
ORF01436 agglutinin receptor (ssp-5) 



ORF01437 abortive infection protein AbiGI (abiGI)



ORF01438 abortive infection protein AbiGII (abiGII)



ORF01439 conserved hypothetical protein



ORF01440 expressed protein of unknown function 
ORF01441 conserved hypothetical protein, degenerate
ORF01442 membrane protein, putative
ORF01443 hypothetical protein



ORF01444 Tn5252, Orf 21 protein, internal deletion



ORF01445 hypothetical protein 



ORF01450 conserved hypothetical protein



ORF01452 hypothetical protein



ORF01454 conserved hypothetical protein 



ORF01459 hypothetical protein



ORF01460 homocysteine S-methyltransferase MmuM, putative
ORF01463 hypothetical protein 



ORF01464 hypothetical protein 



ORF01465 hypothetical protein



ORF01466 transcriptional regulator, TetR family
ORF01477 glutathione S-transferase family protein, putative



ORF01478 conserved domain protein



ORF01486 hypothetical protein



ORF01488 R5 protein 

ORF01489 transcriptional regulator, MarR family, putative 
ORF01494 membrane protein, putative



ORF01497 acetyltransferase, GNAT family
ORF01502 hypothetical protein



ORF01503 conserved hypothetical protein 



ORF01508 surface antigen-related protein 



ORF01535 conserved hypothetical protein



ORF01547 conserved hypothetical protein
ORF01566 expressed cell wall surface anchor family protein
ORF01572 glycosyltransferase, group 1 family protein



ORF01573 glycosyltransferase, group 2 family protein 



ORF01575 membrane protein, putative



ORF01576 glycosyltransferase, group 2 family protein 
ORF01577 glycosyltransferase, group 2 family protein 



ORF01578 nucleotide sugar dehydratase, putative
ORF01581 lipoprotein, putative



ORF01582 conserved hypothetical protein



ORF01596 ammonium transporter family protein



ORF01597 conserved hypothetical protein



ORF01601 hypothetical protein

ORF01608 proton/peptide symporter family protein
ORF01611 hypothetical protein
ORF01615 conserved domain protein



ORF01638 conserved hypothetical protein



ORF01641 conserved hypothetical protein 



ORF01645 cell wall surface anchor family protein
ORF01660 membrane protein, putative



ORF01661 ABC transporter, ATP binding protein 



ORF01666 hypothetical protein 



ORF01667 hypothetical protein 
ORF01670 hypothetical protein 



ORF01672 protease, putative, POINT MUTATION 



ORF01673 hypothetical protein



ORF01674 hypothetical protein



ORF01675 hypothetical protein 



ORF01680 tetracenomycin polyketide synthesis O-methyltransferase TcmP, putative
ORF01681 hypothetical protein 



ORF01682 hypothetical protein
ORF01684 hypothetical protein 



ORF01692 peptide ABC transporter, ATP-binding protein 



ORF01695 peptide ABC transporter, permease protein
ORF01696 peptide ABC transporter, peptide-binding protein
ORF01699 transposase, IS30 family, putative 
ORF01700 transporter, major facilitator family



ORF01703 transcriptional regulator, LysR family 



ORF01715 conserved hypothetical protein 



ORF01719 hypothetical protein



ORF01720 conserved hypothetical protein



ORF01721 glyoxalase family protein 



ORF01727 conserved hypothetical protein 



ORF01729 acetyltransferase, GNAT family
ORF01730 glycosyl transferase, group 2 family protein

ORF01733 hypothetical protein

ORF01734 conserved hypothetical protein

ORF01735 hypothetical protein

ORF01736 hypothetical protein

ORF01737 hypothetical protein

ORF01742 hypothetical protein

ORF01743 PTS system component, putative
ORF01744 conserved hypothetical protein

ORF01748 D-isomer specific 2-hydroxyacid dehydrogenase family protein
ORF01753 conserved hypothetical protein

ORF01754 hypothetical protein

ORF01761 transposase, IS30 family, putative, truncation
ORF01778 amino acid permease, putative
ORF01807 hypothetical protein

ORF01836 hypothetical protein 

ORF01838 hypothetical protein 

ORF01839 dihydroxyacetone kinase family protein

ORF01840 transcriptional regulator, TetR family, putative

ORF01842 hypothetical protein

ORF01843 dihydroxyacetone kinase family protein

ORF01844 dihydroxyacetone kinase family protein
ORF01847 conserved hypothetical protein

ORF01850 hypothetical protein

ORF01863 pyruvate phosphate dikinase (ppdK)

ORF01864 expressed protein of unknown function

ORF01865 CBS domain protein

ORF01866 3-hydroxyacyl-CoA dehydrogenase family protein, putative secreted protein
ORF01892 hypothetical protein

ORF01893 hypothetical protein

ORF01894 conserved hypothetical protein

ORF01895 hypothetical protein

ORF01896 hypothetical protein

ORF01897 hypothetical protein

ORF01898 hypothetical protein

ORF01899 hypothetical protein

ORF01903 conserved hypothetical protein

ORF01904 drug resistance transporter, EmrB/QacA family

ORF01905 hypothetical protein

ORF01922 conserved hypothetical protein

ORF01925 FMN-binding protein

ORF01934 hypothetical protein

ORF01936 polyprenyl synthetase family protein
ORF01939 cytochrome d ubiquinol oxidase, subunit II (cydB)
ORF01940 cytochrome d oxidase, subunit I (cydA)
ORF01941 pyridine nucleotide-disulphide oxidoreductase family protein
ORF01942 prenyltransferase, UbiA family
ORF01943 hypothetical protein
ORF01944 hypothetical protein

ORF01946 cyclopropane-fatty-acyl-phospholipid synthase (cfa)
ORF01951 conserved hypothetical protein
ORF01953 hypothetical protein
ORF01954 conserved hypothetical protein
ORF01984 hypothetical protein
ORF01988 hypothetical protein
ORF01989 hypothetical protein



ORF01990 hypothetical protein 



ORF01991 hypothetical protein 



ORF02000 membrane protein, putative



ORF02001 transposase, IS30 family, putative 



ORF02005 hypothetical protein 



ORF02006 xylulose-5-phosphate/fructose-6-phosphate phosphoketolase (xfp)



ORF02009 conserved hypothetical protein



ORF02010 carbohydrate kinase, FGGY family



ORF02011 hypothetical protein 



ORF02012 PTS system component, putative



ORF02015 glyoxylate reductase, NAPH-dependent



ORF02016 hypothetical protein 



ORF02025 hypothetical protein 



ORF02026 hypothetical protein 



ORF02030 glutamate-cysteine ligase-related protein



ORF02036 phosphinothricin N-acetyltransferase (pat)



ORF02039 conserved hypothetical protein 



ORF02044 conserved hypothetical protein 



ORF02045 conserved hypothetical protein 



ORF02046 prophage LambdaSa2, lysin, putative



ORF02047 prophage LambdaSa2, holin, putative 



ORF02048 conserved hypothetical protein 



ORF02049 hypothetical protein 



ORF02050 conserved domain protein 



ORF02051 prophage LambdaSa2, PblB, putative



ORF02053 conserved hypothetical protein



ORF02056 conserved hypothetical protein 



ORF02057 hypothetical protein 



ORF02058 hypothetical protein



ORF02059 conserved hypothetical protein 



ORF02060 conserved hypothetical protein 



ORF02061 hypothetical protein 



ORF02062 hypothetical protein



ORF02063 conserved domain protein 



ORF02064 conserved domain protein 



ORF02066 prophage LambdaSa2, protease, putative



ORF02067 conserved hypothetical protein 



ORF02068 prophage LambdaSa2, terminase large subunit, putative
ORF02069 hypothetical protein 



ORF02070 hypothetical protein 



ORF02071 prophage LambdaSa2, site-specific recombinase, phage integrase family
ORF02072 conserved hypothetical protein



ORF02073 prophage LambdaSa2, transcriptional regulator, Cro/CI family



ORF02075 hypothetical protein 



ORF02077 hypothetical protein 



ORF02078 conserved hypothetical protein 



ORF02079 conserved hypothetical protein 



ORF02080 conserved hypothetical protein 



ORF02081 hypothetical protein

ORF02084 prophage LambdaSa2, bacteriophage replication protein/hypothetical protein, truncation/fusion



ORF02085 hypothetical protein 



ORF02087 hypothetical protein



ORF02088 conserved hypothetical protein
ORF02089 prophage LambdaSa2, HNH endonuclease family protein
ORF02090 prophage LambdaSa2, antirepressor protein, putative
ORF02091 conserved domain protein
ORF02092 hypothetical protein
ORF02093 hypothetical protein
ORF02094 hypothetical protein
ORF02095 prophage LambdaSa2, repressor protein, putative
ORF02097 hypothetical protein
ORF02098 prophage LambdaSa2, site-specific recombinase, phage integrase family
ORF02100 hypothetical protein
ORF02102 hypothetical protein
ORF02103 microcin immunity protein MccF, putative
ORF02105 oxidoreductase, Gfo/Idh/MocA family
ORF02108 hypothetical protein
ORF02109 Cyclic nucleotide-binding domain protein
ORF02119 hypothetical protein
ORF02124 hypothetical protein
ORF02125 nitroreductase family protein
ORF02134 bacteriocin transport accessory protein, putative
ORF02148 neuraminidase-related protein
ORF02160 2',3'-cyclic-nucleotide 2'-phosphodiesterase (cpdB)
ORF02163 conserved hypothetical protein
ORF02171 membrane protein, putative
ORF02172 hypothetical protein
ORF02173 membrane protein, putative
ORF02175 conserved hypothetical protein, truncation
ORF02181 phosphate transport system regulatory protein PhoU, putative
ORF02187 hypothetical protein
ORF02190 conserved hypothetical protein
ORF02191 hypothetical protein
ORF02194 acetyltransferase, GNAT family
ORF02196 hypothetical protein
ORF02198 acetyltransferase, GNAT family
ORF02201 membrane protein, putative
ORF02203 hypothetical protein
ORF02205 transcriptional regulator, Cro/CI family
ORF02206 conserved hypothetical protein
ORF02207 conserved hypothetical protein TIGR00730
ORF02208 hypothetical protein
ORF02209 site-specific recombinase, phage integrase family
ORF02210 conserved hypothetical protein
ORF02211 conserved hypothetical protein
ORF02212 hypothetical protein
ORF02213 hypothetical protein
ORF02214 transcriptional regulator, Cro/CI family
ORF02215 expressed protein of unknown function
ORF02216 site-specific recombinase, phage integrase family
ORF02217 conserved hypothetical protein
ORF02219 hypothetical protein
ORF02221 cell wall anchor protein-related protein
ORF02223 hypothetical protein
ORF02224 hypothetical protein
ORF02225 hypothetical protein
ORF02226 membrane protein, putative
ORF02227 conjugal transfer protein, interruption-C
ORF02230 conserved hypothetical protein



ORF02231 conserved hypothetical protein
ORF02232 conserved hypothetical protein
ORF02235 hypothetical protein
ORF02236 conserved hypothetical protein
ORF02237 hypothetical protein
ORF02238 hypothetical protein
ORF02239 hypothetical protein
ORF02240 transcriptional regulator, Cro/CI family
ORF02241 hypothetical protein
ORF02242 transcriptional regulator, Cro/CI family
ORF02243 FtsK/SpoIIIE family protein
ORF02244 hypothetical protein
ORF02245 hypothetical protein
ORF02246 cell wall surface anchor family protein
ORF02247 transposase, ISL3 family
ORF02250 mercuric resistance operon regulatory protein MerR (merR)
ORF02251 Mn2+/Fe2+ transporter, NRAMP family
ORF02252 membrane protein, putative
ORF02253 ABC transporter, ATP-binding protein
ORF02254 conserved hypothetical protein
ORF02255 streptomycin resistance protein
ORF02257 hypothetical protein
ORF02258 hypothetical protein
ORF02259 conserved hypothetical protein
ORF02260 acetyltransferase, GNAT family
ORF02261 membrane protein, putative
ORF02263 hypothetical protein
ORF02264 transcriptional regulator, Cro/CI family
ORF02265 PAP2 family protein
ORF02266 conserved hypothetical protein, FRAMESHIFT
ORF02267 conserved hypothetical protein TIGR00730
ORF02268 protease, putative
ORF02269 rhodanese family protein
ORF02271 hypothetical protein
ORF02274 conserved hypothetical protein
ORF02275 5-methyltetrahydrofolate-homocysteine methyltransferase, putative
ORF02277 conserved hypothetical protein
ORF02279 hypothetical protein
ORF02282 sensor histidine kinase
ORF02283 chromosome assembly-related protein
ORF02287 expressed protein of unknown function
ORF02291 pathogenicity protein, putative
ORF02308 hydrolase, haloacid dehalogenase-like family
ORF02314 conserved hypothetical protein
ORF02317 hypothetical protein
ORF02330 hypothetical protein
ORF02344 site-specific recombinase, phage integrase family
ORF02345 conserved hypothetical protein
ORF02346 conserved hypothetical protein
ORF02347 hypothetical protein
ORF02349 conserved hypothetical protein
ORF02350 hypothetical protein
ORF02351 transcriptional regulator, Cro/CI family
ORF02352 conserved domain protein
ORF02354 hypothetical protein

ORF02356 expressed putative secreted protein
ORF02362 sensor histidine kinase
ORF02363 response regulator
ORF02367 membrane protein, putative
ORF02368 conserved hypothetical protein
ORF02379 membrane protein, putative
ORF02395 transcriptional regulator, Cro/CI family
ORF02406 membrane protein, putative
ORF02416 diacylglycerol kinase catalytic domain protein, putative
ORF02418 hypothetical protein
ORF02422 hypothetical protein
ORF02425 conserved hypothetical protein
ORF03001 conserved hypothetical protein
ORF03004 conserved hypothetical protein
ORF03005 cylX protein
ORF03006 Tn916, hypothetical protein
ORF03007 Tn916, hypothetical protein
ORF03008 Tn916, hypothetical protein
ORF03009 Tn916, tetM leader peptide
ORF03010 Tn916, hypothetical protein
ORF03012 prophage LambdaSa2, HNH endonuclease family protein
ORF03013 conserved hypothetical protein
ORF03015 conjugal transfer protein, interruption-N

ORF00035 membrane protein, putative



ORF00087 lipoprotein, putative
ORF00088 hypothetical protein
ORF00089 hypothetical protein
ORF00123 hypothetical protein
ORF00138 hypothetical protein
ORF00187 hypothetical protein
ORF00188 hypothetical protein
ORF00192 hypothetical protein
ORF00205 hypothetical protein
ORF00228 lipoprotein, putative
ORF00234 hypothetical protein
ORF00235 hypothetical protein
ORF00238 hypothetical protein
ORF00240 transcriptional regulator, Cro/CI family
ORF00241 hypothetical protein
ORF00242 conserved hypothetical protein
ORF00243 hypothetical protein
ORF00247 hypothetical protein
ORF00249 hypothetical protein
ORF00253 hypothetical protein
ORF00254 hypothetical protein
ORF00255 hypothetical protein
ORF00256 hypothetical protein
ORF00257 hypothetical protein
ORF00258 hypothetical protein
ORF00259 hypothetical protein
ORF00260 hypothetical protein
ORF00272 expressed putative lipoprotein
ORF00273 hypothetical protein
ORF00274 hypothetical protein
ORF00275 hypothetical protein
ORF00276 hypothetical protein
ORF00278 membrane protein, putative
ORF00285 lipoprotein, putative
ORF00292 hypothetical protein
ORF00294 expressed protein of unknown function
ORF00308 conserved hypothetical protein
ORF00332 hypothetical protein
ORF00340 hypothetical protein
ORF00384 hypothetical protein
ORF00402 membrane protein, putative
ORF00408 hypothetical protein
ORF00416 hypothetical protein
ORF00417 hypothetical protein
ORF00448 hypothetical protein
ORF00476 hypothetical protein
ORF00489 DNA-damage-inducible protein J, putative
ORF00490 hypothetical protein
ORF00491 lipoprotein, putative
ORF00497 conserved domain protein
ORF00510 bacteriocin transport accessory protein, putative
ORF00512 hypothetical protein
ORF00527 hypothetical protein
ORF00556 hypothetical protein
ORF00575 hypothetical protein



ORF00599 hypothetical protein
ORF00618 hypothetical protein
ORF00620 hypothetical protein
ORF00623 hypothetical protein
ORF00626 prophage LambdaSa1, transcriptional regulator, Cro/CI family
ORF00628 hypothetical protein
ORF00630 hypothetical protein
ORF00632 hypothetical protein
ORF00635 hypothetical protein
ORF00636 hypothetical protein
ORF00637 hypothetical protein
ORF00642 conserved hypothetical protein
ORF00644 hypothetical protein
ORF00645 hypothetical protein
ORF00647 hypothetical protein
ORF00649 hypothetical protein
ORF00650 hypothetical protein
ORF00653 conserved hypothetical protein
ORF00657 conserved hypothetical protein, truncation
ORF00661 conserved hypothetical protein
ORF00673 hypothetical protein
ORF00674 hypothetical protein
ORF00675 conserved hypothetical protein
ORF00676 conserved hypothetical protein
ORF00682 hypothetical protein
ORF00685 conserved hypothetical protein
ORF00698 hypothetical protein
ORF00712 hypothetical protein
ORF00718 cell wall surface protein, interruption-N
ORF00723 hypothetical protein
ORF00735 expressed protein of unknown function
ORF00737 conserved hypothetical protein, degenerate
ORF00738 hypothetical protein
ORF00740 hypothetical protein
ORF00741 hypothetical protein
ORF00747 cylD protein (cylD)
ORF00753 cylE protein (cylE)
ORF00756 cylJ protein (cylJ)
ORF00757 cylK protein (cylK)
ORF00758 hypothetical protein
ORF00759 putative secreted protein
ORF00761 hypothetical protein
ORF00796 hydrolase, haloacid dehalogenase-like family
ORF00806 conserved hypothetical protein
ORF00822 ABC transporter, ATP-binding protein
ORF00827 hypothetical protein
ORF00872 cell wall surface anchor family protein
ORF00909 hypothetical protein
ORF00923 hypothetical protein
ORF00924 conserved hypothetical protein
ORF00942 expressed putative secreted protein
ORF00943 hypothetical protein
ORF00944 hypothetical protein
ORF01013 hypothetical protein
ORF01014 hypothetical protein



ORF01015 hypothetical protein
ORF01016 hypothetical protein
ORF01018 hypothetical protein
ORF01019 hypothetical protein
ORF01021 hypothetical protein
ORF01035 Tn916, excisionase
ORF01062 hypothetical protein
ORF01096 nisin-resistance protein, putative
ORF01145 lipoprotein, putative
ORF01146 conserved hypothetical protein, FRAMESHIFT
ORF01148 lipoprotein, putative
ORF01149 hypothetical protein
ORF01150 hypothetical protein
ORF01151 hypothetical protein
ORF01152 lipoprotein, putative
ORF01153 hypothetical protein
ORF01158 hypothetical protein
ORF01159 hypothetical protein
ORF01161 expressed conserved domain protein
ORF01162 conserved hypothetical protein
ORF01166 hypothetical protein
ORF01168 conserved hypothetical protein
ORF01169 hypothetical protein
ORF01174 conserved domain protein
ORF01175 hypothetical protein
ORF01186 cell wall surface anchor family protein, putative
ORF01187 hypothetical protein
ORF01204 hypothetical protein
ORF01215 hypothetical protein
ORF01258 hypothetical protein
ORF01262 conserved hypothetical protein, FRAMESHIFT
ORF01263 hypothetical protein
ORF01265 hypothetical protein
ORF01266 hypothetical protein
ORF01304 polysaccharide biosynthesis protein CpsK(V) (cpsK)
ORF01308 polysaccharide biosynthesis protein CpsM(V) (cpsM)
ORF01309 polysaccharide biosynthesis protein CpsH(V) (cpsH)
ORF01349 hypothetical protein
ORF01384 hypothetical protein
ORF01385 hypothetical protein
ORF01386 conserved hypothetical protein
ORF01392 hypothetical protein
ORF01395 conserved hypothetical protein
ORF01409 conserved hypothetical protein
ORF01410 hypothetical protein
ORF01417 hypothetical protein
ORF01418 hypothetical protein
ORF01420 hypothetical protein
ORF01423 conserved hypothetical protein
ORF01424 conserved hypothetical protein
ORF01425 conserved hypothetical protein
ORF01426 conserved hypothetical protein
ORF01427 hypothetical protein
ORF01431 hypothetical protein
ORF01432 conserved domain protein



ORF01434 hypothetical protein 



ORF01435 calcium-binding protein, putative 



ORF01437 abortive infection protein AbiGI (abiGI)



ORF01438 abortive infection protein AbiGII (abiGII)



ORF01441 conserved hypothetical protein, degenerate



ORF01443 hypothetical protein 



ORF01445 hypothetical protein 



ORF01452 hypothetical protein



ORF01459 hypothetical protein 



ORF01463 hypothetical protein 



ORF01464 hypothetical protein 



ORF01465 hypothetical protein 



ORF01486 hypothetical protein 



ORF01488 R5 protein



ORF01575 membrane protein, putative 



ORF01581 lipoprotein, putative 



ORF01601 hypothetical protein



ORF01611 hypothetical protein 



ORF01638 conserved hypothetical protein 



ORF01645 cell wall surface anchor family protein 



ORF01660 membrane protein, putative 



ORF01666 hypothetical protein 



ORF01667 hypothetical protein 



ORF01670 hypothetical protein 



ORF01673 hypothetical protein 



ORF01674 hypothetical protein



ORF01675 hypothetical protein 



ORF01681 hypothetical protein 



ORF01682 hypothetical protein 



ORF01684 hypothetical protein 



ORF01719 hypothetical protein 



ORF01733 hypothetical protein 



ORF01735 hypothetical protein 



ORF01736 hypothetical protein 



ORF01737 hypothetical protein



ORF01742 hypothetical protein 



ORF01754 hypothetical protein 



ORF01761 transposase, IS30 family, putative, truncation



ORF01807 hypothetical protein



ORF01836 hypothetical protein 



ORF01838 hypothetical protein 



ORF01842 hypothetical protein 



ORF01850 hypothetical protein



ORF01892 hypothetical protein 



ORF01893 hypothetical protein 



ORF01895 hypothetical protein



ORF01896 hypothetical protein



ORF01897 hypothetical protein 



ORF01898 hypothetical protein 



ORF01899 hypothetical protein 



ORF01905 hypothetical protein 



ORF01934 hypothetical protein 



ORF01943 hypothetical protein 



ORF01944 hypothetical protein
ORF01953 hypothetical protein
ORF01984 hypothetical protein
ORF01988 hypothetical protein
ORF01989 hypothetical protein
ORF02005 hypothetical protein
ORF02011 hypothetical protein
ORF02016 hypothetical protein
ORF02025 hypothetical protein
ORF02026 hypothetical protein
ORF02045 conserved hypothetical protein
ORF02047 prophage LambdaSa2, holin, putative
ORF02048 conserved hypothetical protein
ORF02049 hypothetical protein
ORF02050 conserved domain protein
ORF02053 conserved hypothetical protein
ORF02057 hypothetical protein
ORF02058 hypothetical protein
ORF02061 hypothetical protein
ORF02062 hypothetical protein
ORF02063 conserved domain protein
ORF02067 conserved hypothetical protein
ORF02069 hypothetical protein
ORF02070 hypothetical protein
ORF02072 conserved hypothetical protein
ORF02073 prophage LambdaSa2, transcriptional regulator, Cro/CI family
ORF02075 hypothetical protein
ORF02077 hypothetical protein
ORF02078 conserved hypothetical protein
ORF02081 hypothetical protein
ORF02085 hypothetical protein
ORF02087 hypothetical protein
ORF02088 conserved hypothetical protein
ORF02091 conserved domain protein
ORF02092 hypothetical protein
ORF02093 hypothetical protein
ORF02094 hypothetical protein
ORF02097 hypothetical protein
ORF02100 hypothetical protein
ORF02102 hypothetical protein
ORF02108 hypothetical protein
ORF02119 hypothetical protein
ORF02124 hypothetical protein
ORF02171 membrane protein, putative
ORF02172 hypothetical protein
ORF02173 membrane protein, putative
ORF02191 hypothetical protein
ORF02196 hypothetical protein
ORF02203 hypothetical protein
ORF02208 hypothetical protein
ORF02212 hypothetical protein
ORF02213 hypothetical protein
ORF02214 transcriptional regulator, Cro/CI family
ORF02215 expressed protein of unknown function
ORF02217 conserved hypothetical protein
ORF02219 hypothetical protein

Table 12: GBS ORFs not shared with GAS, pneumococcus or any published genome

ORFxxxxx Annotation



ORF02221 cell wall anchor protein-related protein 



ORF02223 hypothetical protein 



ORF02224 hypothetical protein



ORF02225 hypothetical protein 



ORF02231 conserved hypothetical protein 



ORF02235 hypothetical protein 



ORF02236 conserved hypothetical protein 



ORF02237 hypothetical protein 



ORF02238 hypothetical protein 



ORF02239 hypothetical protein 



ORF02241 hypothetical protein 



ORF02244 hypothetical protein 



ORF02245 hypothetical protein



ORF02263 hypothetical protein 



ORF02268 protease, putative 



ORF02271 hypothetical protein 



ORF02279 hypothetical protein 



ORF02283 chromosome assembly-related protein 



ORF02317 hypothetical protein 



ORF02330 hypothetical protein 



ORF02344 site-specific recombinase, phage integrase family 



ORF02345 conserved hypothetical protein 



ORF02347 hypothetical protein 



ORF02349 conserved hypothetical protein 



ORF02350 hypothetical protein 



ORF02351 transcriptional regulator, Cro/CI family 



ORF02354 hypothetical protein 



ORF02356 expressed putative secreted protein 



ORF02395 transcriptional regulator, Cro/CI family



ORF02418 hypothetical protein



ORF02422 hypothetical protein 



ORF02425 conserved hypothetical protein 



ORF03004 conserved hypothetical protein 



ORF03005 cylX protein



ORF03006 Tn916, hypothetical protein 



ORF03007 Tn916, hypothetical protein 



ORF03008 Tn916, hypothetical protein



ORF03009 Tn916, tetM leader peptide 



ORF03010 Tn916, hypothetical protein



ORF03015 conjugal transfer protein, interruption-N

Table 13: Comparative Sequences relating to SAG0466 (oRolase)

SEQ ID NO 1301: SAG0466 FROM THE 2603V/R GBS STRAIN 

CTCCTGCCCCTGCAATGGCAGTTAGACCCATAGGTTTATTTTTATA^ 

ATTCCTGAGCAGGCATAAGGGTGTCCGTAAGCTAA^ 

atgatStagagcatcaatcgctgcaaatggttcattccattc^^ 

CTGTTAATAGTTTTTCCGTAGCCGTGTGAACCAATTCTGGACTAAGCTTGGGATCTCCTGCTACTTCTACAATGTGAACA 
ATCCGGAATTCTGTTTTCTGACTCTGAAGCGTTAGAAATGCAGCAGCATCGTGCATTAAACAAACATTTCCAATAGTGAG 
rAAAGGTGAATTTTCCATCAATCTTGGTAATTTTTGAAAAAATGTTtCTTTTaGTTTTCTAACGCCTTGATCTCGCATCC 
CT^CATTGGTAAGATTACyTCTTCTAAATAGCCACCTTGTTTAGCTGTTAAGGCGCGTTTATGGCTCAAGAATGCCAAT 
TTATCTAACATTTCTCTTCTAAAaCCATATTTTTGACAGACTCTCTGGGCCCCTTCTAACATTACAGTTTCAGCATAAGA 

GTCAGGAGAAAACTGAGCAACTGTATATTCTCCGTTACG 
TACTTTCAATCCCCCCAACAAGAACTTTTTCATTAA^ 

GATGAAGCACACTGCATATCAATCGTTTGTACTGGAATATAGGATTCATAATCAGAAAAAAGAGTCATCAAACGACCAAT 

StgSccSaotaccaactgtgttcccac^ 
aaaggtgtgctcctaaaagttctggacggtaagtttaaattgctt 

SEQ ID NO 1302: SAG0466 FROM THE M732 GBS TYPE III STRAIN

TrrrTATAAAAGGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTCTTAAATCAAATAAAAAAAATA 
GftATCAGAATCTAATA 

TTT^TCTGATTATGZATCCTATMTCCAGTACA 

GTTATCTAAAAATCAGTGCCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAAAGTAGTTCTCTTCAACCTATGAGA 

cSacgc^aaaS^^ 
ottISaIggg^c^gagtctgtcaaaaata^ 

AGAAARCTAAAAGAAGCATTTTTTCAAAAATTACCAAGATTGATGGAAAATTCACCTTTGCTCACTATTGGAAA 

??t^gcac^tgctgctgcatttctaacg 

CAGGAGATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAA^ 

ga^tgaScaattgaatggaatgaaccattt 

AAAATTCAATATTTTTGGAGGGGCATTAGCTTACGGACACCCTTATGCCTGCTCAGGAATTA 
SEQ ID NO 1303: SAG0466 FROM THE 090 GBS TYPE Ia STRAIN

TTGTGGGAACACAGTTGGTACTGGGGGCAATATTGGTCGTTTGATGACTCTTTTTTCTGATTATGAATCCTATATTCCAG 
ScAAACGi^S 

GAAAAAGTTCTTGTTGGGGGGATTGAAAGTAGTTCTCT 

AGAATATACCGTTTCTCAGTTTTCTCCTGACTCTTAkGCTGAAACTGTAATGtTAGAAGGGGCACAA^ 

aatItggtttSgaagagaaatgttagataaattggcat 

TATTTAGAAGAGGTAATCTTACCAATOT^ 

ATTACCflAGATTGATGGrAAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCAT^ 
CGCTTCAGAGTCAGAAAACAGAATTCCGGATTGTTCACATTGTAGAAGTAGCAGGAGATCCCAAGCTTAGTCCAGAATTG 

GTTCACACGGCTACGGAAAAACTATTAACAGAAACTCATACTAA2^TATCGGATTATGA^ 

ATTTGCAGCGATTGATGCTTTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGGCATTAG 
CTTACGGACACCCTTATGCCTGCTCAGG 

SEQ ID NO. 1304: SAG0466 FROM THE COH1 GBS TYPE Ia STRAIN

TTTTTCTGATTATGAATCCTATATTCCAGTACAAACGATTGATATGCAGTGTGCT 
GGTATCTAAAAA 

SEQ ID NO. 1305: SAG0466 FROM THE COB GBS NONTYPEABLE STRAIN (REVERSE COMPLEMENT)

GGGCATTAGCTTACGGACACCCTTAATGCCTGCTCAGGAATTATTAATATCC 



SEQ ID NO. 1306: SAG0466 FROM THE CJB110 GBS NONTYPEABLE STRAIN

GGTATAAAAGGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTCTTAAATC^ 

ACCAGAATCTAACATTGATAGTATTATTTGTGGGAACACAGTTGGTACTGGGGGCAATATTGGTCGTTTGATGACTCTTT 

TTTCTGATTATGAATCCTATATTC

SEQ ID NO. 1307: SAG0466 FROM THE 1169NT1 GBS TYPE V STRAIN REVERSE COMPLEMENT 

CAAGATTGATGGAAAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTCTAACGCTT 

CAGAGTCAGAAAACAGAATTCCGGATTGTTCACATTGTAGAAGTAGCAGGAGATCCCAAGCTTAGTCCAGAATTGGTTCA 

CACGGCTACGGAAAAACTATTAACAGAAACTCATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTG 

CAGCGATTGATGCTTTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGGCATTAGCTTAC 

GGACACCCTTATGCCTGCTCAGGAATTATTAATATCCTTCATCTTATGCAGGCATTAAAATATAAAAATAAACCTATGGG 

CCTAACTGCCATTGCAGGGGCA 
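
Several entries in Table 13 are given as the reverse complement of the sequenced strand. As an illustration only, the following minimal Python sketch shows how a reverse complement can be derived from a listed sequence. The function name `reverse_complement` and the variable names are hypothetical and are not part of the specification; the lookup table covers the four bases plus the standard IUPAC ambiguity codes, and the input is the first 34 bases of SEQ ID NO 1301 as listed above.

```python
# Complement lookup for A/C/G/T plus the IUPAC ambiguity codes (both cases).
_COMPLEMENT = str.maketrans(
    "ACGTRYKMSWBDHVNacgtrykmswbdhvn",
    "TGCAYRMKSWVHDBNtgcayrmkswvhdbn",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    # Complement every base, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


# First 34 bases of SEQ ID NO 1301 as listed (illustrative input only).
forward = "CTCCTGCCCCTGCAATGGCAGTTAGACCCATAGG"
print(reverse_complement(forward))
# prints CCTATGGGTCTAACTGCCATTGCAGGGGCAGGAG
```

Apart from its final base, the printed string reads the same as the last 33 bases listed for SEQ ID NO 1311, which is consistent with the two entries being reported on opposite strands.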

SEQ ID NO. 1308: SAG0466 FROM THE 18RS21 GBS TYPE II STRAIN 

CCTTAACAGTTAAACAAGGTGGCTATTTAGAAGAGGTAATCTTACCAATGGAAGGGATGCGAGATCAAGGCGTTAGAAAA 

CTAAAAGAAACATTTTTTCAAAAATTACCAAGATTGATGGAAAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAAT 
GCACGATGCTGCTGCATTTCTAACGCTTCAGAGTCAGAAAACAGAATTCCGGATTGTTCACAT^ 

ATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAACAGAAACTCATACTAAAATATCGGATTAT 
GATGCAATTGAATGGAATGAACCATT^^ 

CAATATTTTTGGAGGGACATTAGCTTACGGACACCCTTATGCCTGCTCAGGAATTATTAATATCCTTCATCTTATGCAGG 
CATTAAAATATAAAAATAAACCTATGGGTCTAACTGCCATTGCAGGGGCAG 

SEQ ID NO. 1309: SAG0466 FROM THE 18RS21 GBS TYPE II STRAIN 

TCGGTATAAAAGGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTTTTAAATCAAATAAAAAAAATA 
GAATCAGAATCTAACATTGATAGTATTATTTGTGGGAACACAGTTGGTACTGGGGGCAATATTGGTCGTTTGATGACTCT 
TTTTTCTGATTATGAATCCTATATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGTTCTGCCTTGTTTTTTG 
GTTATCTAAAAATCAGTACCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAAAGTAGTTCTCTTCAACCTATGAGA 
CGTTATGCTAAAGAAGATAATCGTAACGGAGAATATACAGTTGCTCAGTTTTCTCCTGACTCTTATGCTGAAACTGTAAT 
GTTAGAAGGGGCCCAGAGAGTCTGTCAAAAATATGGTTTTAGAAGAGAAATGTTAGATAAATTGGCATTCTTGAGCCATA 

AACGCGCCTTAACAGCTAAACA 

SEQ ID NO. 1310: SAG0466 FROM THE H36b GBS TYPE Ib STRAIN




AAATCAAATAAAAAAAATAGAATCAGAATCTAACATTGATAGTATTATTTGTGGGAACACAGTTGGTACTGGGGGCAATA
TTGGTCGTTTGATGACTCTTTTTTCTGATTATGAATCCTATATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCA 
AGTTCTGCCTTGTTTTTTGGTTATCTAAAAATCAGTACCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAAAGTAG 
TTCTCTTCAACCTATGAGACGTTATGCTAAAGAAGATAATCGTAACGGAGAATATACAGTTGCTCAGTTTTCTCCTGACT 

CTTATGCTGAAACTGTAATGTTAGAAGGGGCCC 

SEQ ID NO. 1311: SAG0466 FROM THE H36b GBS TYPE Ib STRAIN (REVERSE COMPLEMENT)

GRAAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTCTAACGCTTCAGAGTCAGAA 

AACAGAATTCCGGATTGTTCACATTGTAGAAGTAGCAGGAGATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGG 

AAAAACTATTAACAGAAACTCATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGAT 

GCTCTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGACATTAGCTTACGGACACCCTTA 

TGCCTGCTCAGGAATTATTAATATCCTTCATCTTATGCAGGCATTAAAATATAAAAATAAACCTATGGGTCTAACTGCCA 

TTGCAGGGGCAGGA 

SEQ ID NO. 1312: SAG0466 FROM THE M781 GBS TYPE III STRAIN (REVERSE COMPLEMENT) 
CCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTCTAACGCTTCAGAGTCAGAAAACAGAATT 
CCGGATTGTTCACATTGTAGAAGTAGCAGGAGATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTAT 
TAACAGAAACTCATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGATGCTTTATTT 
AATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGGCATTAGCTTACGGACACCCTTATGCCTGCTC 
AGGAATTATTAATATCCTTCATCTTATGCAGGCATTAAAATATAAAAATAAACCTATGGGTTCTAACTGC 



SEQ ID NO. 1313: SAG0466 FROM THE M781 GBS TYPE III STRAIN 

GCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTCTTAAATCAAATAAAAAAAATAGAATCAGAATCTAATA
TTGATAGTATTATTTGTGGGAACACAGTTGGTACTGGGGGCAATATTGGTCGTTTGATGACTCTTTTTTCTGATTATGAA 
TCCTATATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGTTCTGCCTTGTTTTTTGGTTATCTAAAAATCAG 
TGCCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAAAGTAGTTCTCTTCAACCTATGAGACGTTACGCTAAAGAAG 
ATAATCGTAACGGAGAATATACCGTTGCTCAGTTTTCTCCTGACTCTTATGCTGAAACTGTAATGTTAGA 

SEQ ID NO 1314: SAG0466 FROM THE A909 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

CCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTCTAACGCTTCAGAGTCAGAAAACAGAATT 

CCGGATTGTTCACATTGTAGAAGTAGCAGGAGATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTAT 

TAACAGAAACTCATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGATGCTCTATTT 

AATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGACATTAGCTTACGGACACCCTTATGCCTGCTC 

AGGAATTATTAATATCCTTCATCTTATGCAGGCATTAAAATATAAAAATAAACCTATGGGTCTAACTGCCATTGCAGGGG 

r

SEQ ID NO. 1315: SAG0466 FROM THE JM9130013 GBS TYPE VIII STRAIN (REVERSE COMPLEMENT)

GCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTCTAACGCTTCAGAGTCAGAAAACAGAATTCCGGA
TTGTTCACATTGTAGAAGTAGCAGGAGATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAACA
GAAACTCATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGATGCTCTATTTAATCA 
TTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGGCATTAGCTTACGGACACCCTTATGCCTGCTCAGGAA 
TTATTAATATCCTTCATCTTATGCAGGCATTAAAATATAAAAATAAACCTATGGGTCTAACTGCCATTGCAGGGGCAGGA 

SEQ ID NO. 1316: SAG0466 FROM THE JM9130013 GBS TYPE VIII STRAIN 

TTTGGGCTACGAACACCTATCGGTATAAAAGGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTTTT 
AAATCAAATAAAAAAAATAGAATCAGAATCTAACATTGATAGTATTATTTGTGGGAACACAGTTGGTACTGGGGGCAATA 
TTGGTCGTTTGATGACTCTTTTTTCTGATTATGAATCCTATATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCA 
AGTTCTGCCTTGTTTTTTGGTTATCTAAAAATCAGTACCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAAAGTAG
TTCTCTTCAACCTATGAGACGTTATGCTAAAGAAGATAATCGTAACGGAGAATATA 
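
The sequences listed above are compared column by column in the alignment blocks of this table. As a hedged illustration of how two aligned rows can be compared, the Python sketch below counts identical columns while skipping any column that holds a gap character. The function name `count_identities` and its `gap` parameter are hypothetical; the two input rows are copied from the alignment rows labelled SEQ1306 and SEQ1309 in this table.

```python
def count_identities(row_a: str, row_b: str, gap: str = "-") -> tuple[int, int]:
    """Return (identical, compared) column counts for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        # Columns containing a gap in either row are not compared.
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    return identical, compared


# Two rows taken from the same alignment block (SEQ1306 and SEQ1309).
row_1306 = "GGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTCTTAAATCAAAT"
row_1309 = "GGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTTTTAAATCAAAT"
identical, compared = count_identities(row_1306, row_1309)
print(f"{identical}/{compared} identical columns")
```

Skipping gapped columns is one convention among several; counting gaps as mismatches would give a lower figure for rows that contain insertions or deletions.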

SEQ1301 CTCCTGCCCCTGCAATGGCAGTTAGACCCATAGGTTTATTTTTATATTTTA 

SEQ1302 

SEQ1303 

SEQ1304

SEQ1305 

SEQ1306 

SEQ1307 

SEQ1308 CTTAACAGTTAAACAAGGTGGCTATTTAGAAGAGGTAATCTTACCAATGGAAGGGATGC 

SEQ1309 

SEQ1310 

SEQ1311 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 

SEQ1301 TGCCTGCATAAGATGAAGGATATTAATAATTCCTGAGCAGGCATAAGGGTGTCCGTAAG 

SEQ1302 TCGGTATAAA 

SEQ1303 

SEQ1304 ATCGGTATAAA 

SEQ1305 TTTTCAAAAATTACCAAGATTGATGG 

SEQ1306 GGTATAAA 

SEQ1307 CAAGATTGATGG 

SEQ1308 AGATCAAGGCGTTAGAAAACTAAAAGAAACATTTTTTCAAAAATTACCAAGATTGATGG

SEQ1309 TCGGTATAAA 

SEQ1310 TTTGGGCTACGAACACCTATCGGTATAAA 

SEQ1311 G 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 TTTGGGCTACGAACACCTATCGGTATAAA

SEQ1301 TAATGTCCCTCCAAA-AATATTGAATTTTTCTCTCTC-TTCAGGATAATAATGATTAAA 
SEQ1302 GGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTCTTAAATCAAAT

SEQ1303 

SEQ1304 GGGAAGCAATTTAAA-ATTACCGTCCAGAACTTTTAGGAGCACACCTCTTAAATCAAAT 

SEQ1305 AAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTC 

SEQ1306 GGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTCTTAAATCAAAT 

SEQ1307 AAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTC 

SEQ1308 AAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTC 

SEQ1309 GGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTTTTAAATCAAAT

SEQ1310 GGGAAGCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTTTTAAATCAAAT 

SEQ1311 AAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTC 

SEQ1312 CCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTC 

SEQ1313 GCAATTTAAACATTACCGTCCAGAACTTTTAGGAGCACACCTCTTAAATCAAAT 

SEQ1314 CCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTC 

SEQ1315 GCTCACTATTGGAAATGTTTGTTTAATGCACGATGCTGCTGCATTTC 
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SEQ1301 
SEQ1302 
SEQ1303 
SEQ1304 
SEQ1305 
SEQ1306 
SEQ1307 
SEQ1308 
SEQ1309 
SEQ1310 
SEQ1311 
SEQ1312 
SEQ1313 
SEQ1314 
SEQ1315 
SEQ1316 

SEQ1301 
SEQ1302 
SEQ1303 
SEQ1304 
SEQ1305 
SEQ1306 
SEQ1307 
SEQ1308 
SEQ1309 
SEQ1310 
SEQ1311 
SEQ1312 
SEQ1313 
SEQ1314 
SEQ1315 
SEQ1316 

SEQ1301 
SEQ1302 
SEQ1303 
SEQ1304 
SEQ1305 
SEQ1306 
SEQ1307 
SEQ1308 
SEQ1309 
SEQ1310 
SEQ1311 
SEQ1312 
SEQ1313 
SEQ1314 
SEQ1315 
SEQ1316 



GGGAAGCAATTTAAACATTACCGTCCAGAACXTTTAGGAGCACACCTTTTAAATCAAAT 

AGAGCATCAATCGCTGCAAATGGTTCATTCC-ATTCAATTGCATCATAATCCGATATTT 
AAAAAAATAGAATCAGAATCTAAT — ATT GATAGTATTATTTGTGGGAACA-CAGT 



AAAAAAATAGAATCAGAATCTAAT — ATT GATAGTATTATTTGTGGGAACA-CAGT 

AACGCTTCAGAGTCAGAAAACAGA — ATTCCGGATTGTTCACATTGTAGAAGTAGCAGG 

AAAAAAATATAACCAGAATCTAAC — ATT GATAGTATTATTTGTGGGAACA-CAGT 

AACGCTTCAGAGTCAGAAAACAGA — ATTCCGGATTGTTCACATTGTAGAAGTAGCAGG 
AACGCTTCAGAGTCAGAAAACAGA — ATTCCGGATTGTTCACATTGTAGAAGTAGCAGG 

AAAAAAATAGAATCAGAATCTAAC — ATT GATAGTATTATTTGTGGGAACA-CAGT 

AAAAAAATAGAATCAGAATCTAAC — ATT GATAGTATTATTTGTGGGAACA-CAGT 

AACGCTTCAGAGTCAGAAAACAGA — ATTCCGGATTGTTCACATTGTAGAAGTAGCAGG 
AACGCTTCAGAGTCAGAAAACAGA — ATTCCGGATTGTTCACATTGTAGAAGTAGCAGG 

AAAAAAATAGAATCAGAATCTAAT — ATT GATAGTATTATTTGTGGGAACA-CAGT 

AACGCTTCAGAGTCAGAAAACAGA — ATTCCGGATTGTTCACATTGTAGAAGTAGCAGG 
AACGCTTCAGAGTCAGAAAACAGA — ATTCCGGAT TGT TCACATTGTAGAAGT AGCAGG 
AAAAAAATAGAATCAGAATCTAAC — ATT GATAGTATTATTTGTGGGAACA-CAGT 

AGTATGAGTTTCTGTTAATAGTTTTTCCGTAGCCGTGTGAACCAATTCTGGACTAAGCT 
GGTACTGGGGGCAATATTGG-TCGTTTGATGACTCTTTTTTCTGATTATGAATCCTA — 
GGTACTGGGGGCAATATTGG-TCGTTTGATGACTCTTTTTTCTGATTATGAATCCTA — 
GGTACTGGGGGCAATATTGG-TCGTTTGATGACTCTTTTTTCTGATTATGAATCCTA — 
GATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAACAGAAAC 
GGTACTGGGGGCAATATTGG-TCGTTTGATGACTCTTTTTTCTGATTATGAATCCTA — 
GATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAACAGAAAC 
GATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAACAGAAAC 
GGTACTGGGGGCAATATTGG-TCGTTTGATGACTCTTTTTTCTGATTATGAATCCTA — 
GGTACTGGGGGCAATATTGG-TCGTTTGATGACTCTTTTTTCTGATTATGAATCCTA — 
GATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAACAGAAAC 
GATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAACAGAAAC 
GGTACTGGGGGCAATATTGG-TCGTTTGATGACTCTTTTTTCTGATTATGAATCCTA — 
GATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTATTAACAGAAAC 
GATCCCAAGCTTAGTCCAG7VATTGGTTCACACGGCTACGGAAAAACTATTAACAGAAAC 
GGTACTGGGGGCAATATTGG-TCGTTTGATGACTCTTTTTTCTGATTATGAATCCTA — 



GGGATCTCCTGCTACTTCTACAATGTGAACAATCCGGA-ATTCTGTTTTCTGACTCTGA 
TATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGT — TCTGCCTTGTTTTT 
TATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGT — TCTGCCTTGTTTTT 
TATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGT — TCTGCCTTGTTTTT
CATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGA 

TATTC 

CATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGA 

CATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGA 
TATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGT — TCTGCCTTGTTTTT 
TATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGT — TCTGCCTTGTTTTT
CATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGA
CATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGA
TATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGT — TCTGCCTTGTTTTT 
CATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGA 
CATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTGCAGCGATTGA 
TATTCCAGTACAAACGATTGATATGCAGTGTGCTTCATCAAGT — TCTGCCTTGTTTTT

SEQ1301

SEQ1302 

SEQ1303 

SEQ1304 

SEQ1305 

SEQ1306 

SEQ1307 

SEQ1308 

SEQ1309 

SEQ1310 

SEQ1311 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 

SEQ1301 

SEQ1302 

SEQ1303 

SEQ1304 

SEQ1305 

SEQ1306 

SEQ1307 

SEQ1308 

SEQ1309 

SEQ1310 

SEQ1311 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 

SEQ1301 

SEQ1302 

SEQ1303 

SEQ1304 

SEQ1305 

SEQ1306 

SEQ1307 

SEQ1308 

SEQ1309 

SEQ1310 

SEQ1311 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 



GCGTTAGAAATGCAGCAGCATCGTGCATTAAACAAACATTTC— CAATAGTGAGCAAAG 
GGT-TATCTAAAAATCAGTG-CCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAA 
GGT-TATCTAAAAATCAGTG-CCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAA 

GGG-TATCTAAAAA

GCTTTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGGC

GCTTTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGGC 
GCTCTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGAC 
GGT-TATCTAAAAATCAGTA-CCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAA 
GGT-TATCTAAAAATCAGTA-CCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAA 
GCTCTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGAC 
GCTTTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGGC 
GGT-TATCTAAAAATCAGTG-CCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAA 
GCTCTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGAC
GCTCTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTTTTGGAGGGGC 
GGT-TATCTAAAAATCAGTA-CCGGTATTAATGAAAAAGTTCTTGTTGGGGGGATTGAA 

TGAATTTTCCATCAATCTTGG — TAATTTTTGAAAAAATGTTTCTTTTAGTTTTCTAAC 
GTAGTTCTCTTCAACCTATGAGACGTTACGCTAAAGAAGATAATCGTAACGGAGAATAT 
GTAGTTCTCTTCAACCTATGAGACGTTACGCTAAAGAAGATAATCGTAACGGAGAATAT 



TTAGCTTACGGACACCCTTAA — TGCCTGCTCAGGAATTATTAATATCC 

TTAGCTTACGGACACCCTTA TGCCTGCTCAGGAATTATTAATATCCTTCATCTTAT 

TTAGCTTACGGACACCCTTA TGCCTGCTCAGGAATTATTAATATCCTTCATCTTAT

GTAGTTCTCTTCAACCTATGAGACGTTATGCTAAAGAAGATAATCGTAACGGAGAATAT 
GTAGTTCTCTTCAACCTATGAGACGTTATGCTAAAGAAGATAATCGTAACGGAGAATAT 

TTAGCTTACGGACACCCTTA TGCCTGCTCAGGAATTATTAATATCCTTCATCTTAT 

TTAGCTTACGGACACCCTTA TGCCTGCTCAGGAATTATTAATATCCTTCATCTTAT 

GTAGTTCTCTTCAACCTATGAGACGTTACGCTAAAGAAGATAATCGTAACGGAGAATAT 

TTAGCTTACGGACACCCTTA TGCCTGCTCAGGAATTATTAATATCCTTCATCTTAT

TTAGCTTACGGACACCCTTA TGCCTGCTCAGGAATTATTAATATCCTTCATCTTAT 

GTAGTTCTCTTCAACCTATGAGACGTTATGCTAAAGAAGATAATCGTAACGGAGAATAT 

CCTTGATCTCGCATCCCTTCCATTGGTAAGATTACYTCTTCTAAATAGCCACCTTGTTT 
CCGTTGCTCAGTTTTCTCCTGACTCTTATGCTG — AAACTGTAATGTTAGAAGGGGCAC 
CCGTTGCTCAGTTTTCTCCTGACTCTTAKGCTG — AAACTGTAATGTTAGAAGGGGCAC 

CAGGCATTAAAATATAAAAATAAACCTATGGGC-CTAACTGCCATTGCAGGGGCA 

CAGGCATTAAAATATAAAAATAAACCTATGGGT-CTAACTGCCATTGCAGGGGCAG 

CAGTTGCTCAGTTTTCTCCTGACTCTTATGCTG — AAACTGTAATGTTAGAAGGGGCCC 
CAGTTGCTCAGTTTTCTCCTGACTCTTATGCTG — AAACTGTAATGTTAGAAGGGGCCC 
CAGGCATTAAAATATAAAAATAAACCTATGGGT-CTAACTGCCATTGCAGGGGCAGGA- 

CAGGCATTAAAATATAAAAATAAACCTATGGGTTCTAACTGC 

CCGTTGCTCAGTTTTCTCCTGACTCTTATGCTG — AAACTGTAATGTTAGA 

CAGGCATTAAAATATAAAAATAAACCTATGGGT-CTAACTGCCATTGCAGGGGC 

CAGGCATTAAAATATAAAAATAAACCTATGGGT-CTAACTGCCATTGCAGGGGCAGGA- 

SEQ1301

SEQ1302 

SEQ1303 

SEQ1304 

SEQ1305 

SEQ1306 

SEQ1307 

SEQ1308 

SEQ1309 

SEQ1310 

SEQ1311 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 

SEQ1301 

SEQ1302 

SEQ1303 

SEQ1304 

SEQ1305 

SEQ1306 

SEQ1307 

SEQ1308 

SEQ1309 

SEQ1310 

SEQ1311 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 

SEQ1301 

SEQ1302 

SEQ1303 

SEQ1304 

SEQ1305

SEQ1306 

SEQ1307 

SEQ1308 

SEQ1309 

SEQ1310 

SEQ1311 

SEQ1312 

SEQ1313

SEQ1314 

SEQ1315 

SEQ1316 



Table 13: Comparative Sequences relating to SAG0466 (^^Plase)



GCTGTTAAGGCGCGTTTATGGCTCAAGAATGCCAATTTATCTAACATTTCTCTTCTAAA 
AAGAGTCTGTCAAAAATATGGTTTTAGAAGAGAAATGTTAGATAAATTGGCATTCTTGA
AAGAGTCTGTCAAAAATATGGTTTTAGAAGAGAAATGTTAGATAAATTGGCATTCTTGA



GAGAGTCTGTCAAAAATATGGTTTTAGAAGAGAAATGTTAGATAAATTGGCATTCTTGA 



CCATATTTTTGACAGACTCTCTGGGCCCCTT — CTAACATTACAGTTTCAGCATAAGAG
CCATAAACGCGCCTTAACAGCTAAACAAGGTGGCTATTTAGAAGAGGTAATCTTACCAA
CCATAAACGCGCCTTAACAGCTAAACAAGGTGGCTATTTAGAAGAGGTAATCTTACCAA 



CCATAAACGCGCCTTAACAGCTAAACA- 



CAGGAGAAAACTGAGCAACTGTATATTCTCCGTTACGATTATCTTCTTTAGCATAACGT 
GGAAGGGATGCGAGATCAAGGCGTTAGAAAACTAAAAGAAGCATTTTTTCAAAAATTAC 
GGAAGGGATGCGAGATCAAGGCGTTAGAAAACTAAAAGAAGGATTTTTTCAAAAATTAC



SEQ1301



SEQ1302 

SEQ1303 

SEQ1304 

SEQ1305 

SEQ1306 

SEQ1307 

SEQ1308 

SEQ1309 

SEQ1310

SEQ1311 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 

SEQ1301 

SEQ1302 

SEQ1303 

SEQ1304 

SEQ1305 

SEQ1306 

SEQ1307 

SEQ1308 

SEQ1309 

SEQ1310 

SEQ1311 

SEQ1312 

SEQ1313 

SEQ1314 

SEQ1315 

SEQ1316 

SEQ1301 
SEQ1302 
SEQ1303 
SEQ1304 
SEQ1305 
SEQ1306 
SEQ1307 
SEQ1308 
SEQ1309 
SEQ1310 
SEQ1311
SEQ1312 
SEQ1313 
SEQ1314 
SEQ1315
SEQ1316 






TCATAGGTTGAAGAGAACTACTTTCAATCCCCCCAACAAGAACTTTTTCATTAATACCG 
AAGATTGATGGAAAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATG 
AAGATTGATGGRAAATTCACCTTTGCTCACTATTGGAAATGTTTGTTTAATGCACGATG 



TACTGATTTTTAGATAACCAAAAAAC — AAGGCAGAACTTGATGAAGCACACTGCATAT 
TGCTGCATTTCTAACGCTTCAGAGTCAGAAAACAGAATTCCGGATTGTTCACATTGTAG 
TGCTGCATTTCTWACGCTTCAGAGTCAGAAAACAGAATTCCGGATTGTTCACATTGTAG



AATCGTTTGTACTGGAATATAGGATTCATAATCAGAAAAAAGAGTCATCAAACGACCAA 
AGTAGCAGGAGATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTAT 
AGTAGCAGGAGATCCCAAGCTTAGTCCAGAATTGGTTCACACGGCTACGGAAAAACTAT
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SEQ1302 
SEQ1303 
SEQ1304 
SEQ1305 
SEQ1306 
SEQ1307 
SEQ1308 
SEQ1309 
SEQ1310 
SEQ1311 
SEQ1312 
SEQ1313 
SEQ1314 
SEQ1315 
SEQ1316 

SEQ1301 
SEQ1302 
SEQ1303 
SEQ1304 
SEQ1305 
SEQ1306 
SEQ1307 
SEQ1308 
SEQ1309 
SEQ1310
SEQ1311 
SEQ1312 
SEQ1313 
SEQ1314 
SEQ1315
SEQ1316 

SEQ1301 
SEQ1302 
SEQ1303 
SEQ1304 
SEQ1305 
SEQ1306 
SEQ1307 
SEQ1308 
SEQ1309 
SEQ1310 
SEQ1311 
SEQ1312 
SEQ1313 
SEQ1314 
SEQ1315 
SEQ1316 




-ATTGCCCCCAGTACCAACTGTGTTCCCACAAATAATACTATCAATGTTAGATTCTGATT 
AACAGAAACTCATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTG 
AACAGAAACTCATACTAAAATATCGGATTATGATGCAATTGAATGGAATGAACCATTTG 

TATTTTTTTTATTTGATTTAAAAGGTGTGCTCCTAAAAGTTCTGGACGGTAAGTTTAAA 
AGCGATTGATGCTTTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTT 
AGCGATTGATGCTTTATTTAATCATTATTATCCTGAAGAGAGAGAAAAATTCAATATTT 

TGCTT 

TGGAGGGGCATTAGCTTACGGACACCCTTATGCCTGCTCAGGAATTA 

TGGAGGGGCATTAGCTTACGGACACCCTTATGCCTGCTCAGG

Table 14: Comparative Sequences relating to SAG0471 (glucokinase)

SEQ ID NO. 1401: SAG0471 FROM THE 18RS21 GBS TYPE II STRAIN

TTAAATTTGGTATCTTGACGCTTGAGGGAGAAGTACAAGAAAAATGGGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTT 
TCTGATATCGTTGAATCTCTCAAAC^TCGTTTGAGC^ 

AGCTGTTGATAGAACTAGTAAAAC^GTAACAGGTGCTTTTAATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCAGTTATTGAAAAAG 

AAGTTGGAATTCCATTTTTTATTGATAACGATGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGCTGGTGCCAATAATCCCGAC 

GTTGTTTTCGTAACCCTCGGAACAGGAGTAGGTGGAGGTGTTATCGCAGATGGTAACCTCATCCATGGTGTTGCAGGAGCAGGTGGAGA 

AATTGGGCATATGATTGTTGATCCAGAAAATGGATTTACGTGC^CATGTGGTAACAAAGGCTGCCTTGAGACAGTTGCATCAGCGACAG 

GTGTTGTTAGAGTAGCACGTCAACTCGCAGAACAATATGAGGGTTCGTCTGCCATTAAAGCAGCGATTGACACCGGTGATACTCTTACA 

AGTAAAGATATTTTTATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTATCACGTTACCTTGGACTGGCAGC 

AGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATTGGTGGCGGTGTCTCAGCAGCAGGTGAA 

AGAAATACTTTGTCACATTTGCTTTCCCACAAGTTAAAAAGTCAACTAAAATTAAGAT 

SEQ ID NO. 1402: SAG0471 FROM THE 090 GBS TYPE Ia STRAIN

CGTTTCTGATATCGTTGAATCTCTCAAACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGGTATCGGTATGGGTTCTC 
CAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTTTAATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCGGTTATTGAA 
AAAGAAGTTGGAATTCCATTTTTTATTGATAACGATGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGCTGGTGCCAATAATCC 
CGATGTTGTTTTCGTAACCCTCGGAACAGGAGTAGGTGGAGGTGTTATCGCAGATGGTAACCTCATCCATGGTGTTGCAGGAGCAGGTG 
GAGAAATTGGGCATATGATTGTTGATCCAGAKAATGGATTTACGTGCACATGTGGTAACAAAGGCTGTCTTGAGACAGTTGCATCAGCG 
ACAGGTGTTGTTAGAGTAGCACGTCAACTCGCAGAACAATATGAAGGTTCGTCTGCCATTAAAGCAGCGATTGACAACGGTGATACTGT 
TACAAGTAAAGATATTTTTATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTATCACGTTACCTTGGACTGG 
CAGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATTGGTGGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGC 

GTTGAGAAATACTTTGTCACATTTG 

SEQ ID NO. 1403: SAG0471 FROM THE COH1 GBS TYPE Ia STRAIN

ACAAGAAAAATGGGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATATCGTTGAATCTCTCAAACATCGTTTGA 

GCCTCTATGGATTAACAAAAGATGACTTTCTCGGTATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACA^ 

GCTTTTAATCTAAATTGGGCTGATACTCAAGA 

SEQ ID NO. 1404: SAG0471 FROM THE CJB110 GBS NONTYPEABLE STRAIN

TTGGTATCTTGACGCTTGAGGAGAAGTACAAGAAAAATGGGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATA 
TCGTTGAATCTCTCAAACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGGTATCGGTATGGGGTCTCCAGGAGCTGTT 

GATAGAACTAGTAAAAC 

\ 

SEQ ID NO. 1405: SAG0471 FROM THE CJB110 GBS NONTYPEABLE STRAIN 

CACCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATTGGTGGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGC 
GTTGAGAAATACTTTGTCACATTTGCTTTCCCACAAGTTAAAAAGTCAACTA 

SEQ ID NO. 1406: SAG0471 FROM THE 2603V/R GBS TYPE V STRAIN 

GGGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATATCGTTGAATCTCTCAAACATCGTTTGAGCCTCTATGGA 

TTAACAAAAGATGACTTTCTCGGTATCGGTATGGGTTCTCCAGGAGCTG 

SEQ ID NO. 1407: SAG0471 FROM THE H36b GBS TYPE Ib STRAIN

GGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATATCGTTGAATCTCTCAAACATCGTTTGAGCCTCTATGGAT 
TAACAAAAGATGACTTTCTCGGTATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTTTAATCTA 
AATTGGGCTGATACTCAAGAAGTAGGTTCAGTTATTGAAAAAGAAGTTGGAATTCCATTTTTTATTGATAACGATGCT 
ACTTGGTGAACGCTGGGTAGGTGCTGGTGCCAATAATCCCGACGTTGTTTTCGTAACC 

SEQ ID NO. 1408: SAG0471 FROM THE H36 GBS TYPE Ib STRAIN (REVERSE COMPLEMENT)

GAGACAGTTGCATCAGCGACAGGTGTTGTTAGAGTAGCACGTCAACTCGCAGAACAATATGAGGGTTCGTCTGCCATTAAAGCAGCGAT
TGACAACGGTGATACTGTTACAAGTAAAGATATTTTTATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTAT 
CACGTTACCTTGGACTGGCAGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATTGGTGGCGGTGTCTCAGCAGCAGGT 
GAATTTTTACGTAGTCGCGTTGAGAAATACTTTGTCACATTTGCTTTCCCACA 

SEQ ID NO. 1409: SAG0471 FROM THE M732 GBS TYPE III STRAIN 

ACAAGAAAAATGGGCAATTGAGACCATACTTAGAAAACGGAAGACATATCGTTTCTGATATCGTTGAATCTCTCAAACATCGTTTGAGC 

CTCTATGGATTAACAAAAGATGACTTTCTCGGTATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGC

TTTTAATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCGGTTATTGAAAAAGAAGTTGGAATTCCATTTTTTATTGATAACGATGCTA 
ATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGCTGGTGCCA^^ 

GGTGTTATCGCAGATGGTAACCTCATCCATGGTGTTGCAAGAGCAGGTGGAGAAATTGGGCATATGATT 

SEQ ID NO. 1410: SAG0471 FROM THE M732 GBS TYPE III STRAIN (REVERSE COMPLEMENT)

CAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTTGTC^ 
ATTGCTGAACTAGGTAATGAT 

SEQ ID NO. 1411: SAG0471 FROM THE M781 GBS TYPE III STRAIN
AGAAGTACAAGAAAATGGGCAATTGAGACCATACTTAGAAAACGGAAGACA 

TGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGGTATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACA 
GGTGCTTTTAATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCGGTTATTGAAAAAGAAGTTGGAATTCCATTTTTTATTGATAACGA 
TGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGCTGGTGCCAATAATCCCGATGTTGTTTTCGTAACCCT 



SEQ ID NO. 1412: SAG0471 FROM THE M781 GBS TYPE III STRAIN (REVERSE COMPLEMENT)
™t^g™caagtWga^^ 

GTAGTCGCGTTGAGAAATACTTTGTCACATTTGCTTTCCCACAAGTTAAAAA 
TTTCGTAACCCTCGGAACAGGAGTAGGTGGAGG 




SEQ ID NO. 1415: SAG0471 FROM THE JM9130013 GBS TYPE VIII STRAIN (REVERSE COMPLEMENT)

GTTATCGCAGATGGTAACCTCATCCATO 

rTrMCMGTGGTAACAAAGGCTGCCTTGAGACAGTTGCATCAGCGA 



TTTGCTAACTCTGTTGT^CGTGTATCACCT^^ 
TAOTGGTGGCGGTCTCTCAGCAGCAGGTGAATTTTTACCT 

AGTCAACTAA 
TTCCATTTTTTATTG 

SEQ ID NO. 1417: SAG0471 FROM THE 2603V/R TYPE V GBS STRAIN (REVERSE COMPLEMENT)

ftGCRGCTAATATTTCAAATA 

TTGAGAAATACTTTGTCACATTTGTTTTCCCACAAGGT 

SEQ1401

SEQ1402

SEQ1403

SEQ1404

SEQ1405

SEQ1406

SEQ1407

SEQ1408

SEQ1409

SEQ1410

SEQ1411

SEQ1412

SEQ1413

SEQ1415 TTATCGCAGAT^TAACCTCATCCATGGTGTTGCAGGAGCAGG^

SEQ1416

SEQ1417



SEQ1401

SEQ1402

SEQ1403

SEQ1404

SEQ1405

SEQ1406

SEQ1407 gag

SEQ1408

SEQ1409

SEQ1410

SEQ1411

SEQ1412

SEQ1413

SEQ1415 T^TTGCTGATCCAGAAAATGGATTTACGTGCACATGTGGTAACAAAGGCT

SEQ1416

SEQ1417

SEQ1401

SEQ1402

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 

SEQ1401

SEQ1402

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 



CAGTTGCATCAGCGACAGGTGTTGTTAGAGTAGCACGTCAACTCGCAGAACAATATGAG 



CAGTTGCATCAGCGACAGGTGTTGTTAGAGTAGCACGTCAACTCGCAGAACAATATGAG 



GTTCGTCTGCCATTAAAGCAGCGATTGACAACGGTGATACTGTTACAAGTAAAGATATT 



GATACTGTTACAAGTAAAGATATT 



GTGATACTGTTACAAGTAAAGATATT

GTTCGTCTGCCATTAAAGCAGCGATTGACCACGGTGATACTGTTACAAGTAAAGATATT



TTAAATTTGGTATCTTGACGCTTGAGGGAGAAGTACAA 



ACAA 

-TTGGTATCTTGACGCTTGAGG-AGAAGTACAA 



TTATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTATCACGT 
ACAA



AGAAGTACAA 

TTATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTATCACGT 

AAATTTGGTATCTTGACGCTTGAGGGAGAAGTACAA 

TTATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTATCACGT 
TTATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTATCACGT 
TGGTATCTTGACGCTTGAGGGAGAAGTACAA

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 
SEQ1410
SEQ1411
SEQ1412 
SEQ1413 
SEQ1414 
SEQ1415 
SEQ1416 
SEQ1417 

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 



AAAAATGGGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATATC 

CGTTTCTGATATC 

AAAAATGGGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATATC 
AAAAATGGGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATATC 

CACCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATT 

GGGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATATC 

GGCAATTGAGACCAATACTTTAGAAAACGGAAGACATATCGTTTCTGATATC 

ACCTTGGACTGGCAGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATT 
AAAAATGGGCAATTGAGACCA-TACTT-AGAAAACGGAAGACATATCGTTTCTGATATC 

^^Itgggcaattgagacca-tactt--agaaaacggaagacatatcgtttctgatatc 

ACCTTGGACTGGCAGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATT 
AAAAATGGGCA-TTGAGACCA-TACTT-AGAAAACGGAAGACATATCGTTTCTGATATC 
ACCTTGGACTGGCAGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATT 
ACCTTGGACTGGCAGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATT 
AAAAATGGGCAATTGAGACCA-TACTT-AGAAAACGGAAGACATATCGTTTCTGATATC 
AGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATT 

TTGAATCTCTCA-AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 
TTGAATCTCTCA-AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 
TTGAATCTCTCA-AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 
TTGAATCTCTCA-AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 

GTGGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTT 

TTGAATCTCTCA— AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 
TTGAATCTCTCA-AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 

GTGGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTT 

TTGAATCTCTCA— AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 

CAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTT 

TTGAATCTCTCA-AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 

GTGGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTT 

TTGAATCTCTCA— AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 

GTGGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTT 

GTGGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTT 

TTGAATCTCTCA-AACATCGTTTGAGCCTCTATGGATTAACAAAAGATGACTTTCTCGG 
GTGGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTT 

ATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTT 

ATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTT 

ATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTT 

ATCGGTATGGGGTCTCCAGGAGCTGTTGATAGAACTAGTAAAAC 

GTCACATTTGCTTTCCCACAAGTTAAAAAGTCAACTA 

ATCGGTATGGGTTCTCCAGGAGCTG

ATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTT 

GTCACATTTGCTTTCCCACA

ATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTT 

GTCACATTTGCTTTCCCACAAGTTAAAAAGTCAACTAAAATTAAGATTGCTGAACTAGG 

ATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTT 

GTCACATTTGCTTTCCCACAAGTTAAAAA 

ATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTAACAGGTGCTTT 

ATCACATTTGCTTTCCCACAAGTTAAAAAGTCAACTAAAATTAAGATTG 

GTCACATTTGCTTTCCCACAAGTTAAAAAGTCAACTAA 

ATCGGTATGGGTTCTCCAGGAGCTGTTGATAGAACTAGTAAAACAGTCACAGGTGCTTT 

GTCACATTTGTTTTCCCACAAGGT

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405

SEQ1406 

SEQ1407

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 

SEQ1401

SEQ1402

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 



AATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCAGTTATTGAAAAAGAAGTTGGAAT 
AATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCGGTTATTGAAAAAGAAGTTGGAAT 
AATCTAAATTGGGCTGATACTCAAGA 



AATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCAGTTATTGAAAAAGAAGTTGGAAT 

AATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCGGTTATTGAAAAAGAAGTTGGAAT 

AATGAT

AATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCGGTTATTGAAAAAGAAGTTGGAAT 

AATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCAGTTATTGAAAAAGAAGTTGGAAT 

AATCTAAATTGGGCTGATACTCAAGAAGTAGGTTCAGTTATTGAAAAAGAAGCTGGAAT

CCATTTTTTATTGATAACGATGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGC 
CCATTTTTTATTGATAACGATGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGC 



CCATTTTTTATTGATAACGATGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGC 
CCATTTTTTATTGATAACGATGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGC 
CCATTTTTTATTGATAACGATGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGC
CCATTTTTTATTGATAACGATGCTAATGTTGCAGCACTTGGTGAACGCTGGGTAGGTGC 



CCATTTTTTATTG 

GGTGCCAATAATCCCGACGTTGTTTTCGTAACCCTCGGAACAGGAGTAGGTGGAGGTGT 
GGTGCCAATAATCCCGATGTTGTTTTCGTAACCCTCGGAACAGGAGTAGGTGGAGGTGT 



GGTGCCAATAATCCCGACGTTGTTTTCGTAACC

GGTGCCAATAATCCCGATGTTGTTTTCGTAACCCTCGGAACAGGAGTAGGTGGAGGTGT 



GGTGCCAATAATCCCGATGTTGTTTTCGTAACCCTCGGAACAGGAGTA- 



GGTGCCAATAATCCCGACGTTGTTTTCGTAACCCTCGGAACAGGAGTAGGTGGAGG-

SEQ1401

SEQ1402

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 
SEQ1409 
SEQ1410 
SEQ1411 
SEQ1412
SEQ1413 
SEQ1414 
SEQ1415 
SEQ1416 
SEQ1417 

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409

SEQ1410

SEQ1411

SEQ1412

SEQ1413

SEQ1414

SEQ1415

SEQ1416

SEQ1417

SEQ1401

SEQ1402

SEQ1403

SEQ1404

SEQ1405

SEQ1406

SEQ1407

SEQ1408

SEQ1409

SEQ1410

SEQ1411

SEQ1412

SEQ1413

SEQ1414

SEQ1415

SEQ1416

SEQ1417



ATCGCAGATGGTAACCTCATCCATGGTGTTGCAGGAGCAGGTGGAGAAATTGGGCATAT 
ATCGCAGATGGTAACCTCATCCATGGTGTTGCAGGAGCAGGTGGAGAAATTGGGCATAT 



ATCGCAGATGGTAACCTCATCCATGGTGTTGCAAGAGCAGGTGGAGAAATTGGGCATAT 



ATTGTTGATCCAGAAAATGGATTTACGTGCACATGTGGTAACAAAGGCTGCCTTGAGAC 
ATTGTTGATCCAGAKAATGGATTTACGTGCACATGTGGTAACAAAGGCTGTCTTGAGAC 



ATT — 



GTTGCATCAGCGACAGGTGTTGTTAGAGTAGCACGTCAACTCGCAGAACAATATGAGGG 
GTTGCATCAGCGACAGGTGTTGTTAGAGTAGCACGTCAACTCGCAGAACAATATGAAGG

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 



TCGTCTGCCATTAAAGCAGCGATTGACACCGGTGATACTGTTACAAGTAAAGATATTTT 
TCGTCTGCCATTAAAGCAGCGATTGACAACGGTGATACTGTTACAAGTAAAGATATTTT 



ATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTATCACGTTA 
ATAGCAGCAGAAGATGGGGATAAATTTGCTAATTCTGTTGTTGAACGTGTATCACGTTA 



CTTGGACTGGCAGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATTGG 
CTTGGACTGGCAGCAGCTAATATTTCAAATATTTTAAACCCTGATTCTGTGGTTATTGG

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 

SEQ1401

SEQ1402 

SEQ1403 

SEQ1404 

SEQ1405 

SEQ1406 

SEQ1407 

SEQ1408 

SEQ1409 

SEQ1410 

SEQ1411 

SEQ1412 

SEQ1413 

SEQ1414 

SEQ1415 

SEQ1416 

SEQ1417 




GGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTTGTCAC 
GGCGGTGTCTCAGCAGCAGGTGAATTTTTACGTAGTCGCGTTGAGAAATACTTTGTCAC 



TTTGCTTTCCCACAAGTTAAAAAGTCAACTAAAATTAAGAT 
TTTG 




Table 15: Comparative Sequences relating to SAG0492

SEQ ID NO. 1501: SAG0492 FROM THE 1169OT1 GBS NONTYPEABLE STRAIN
TG^COTG^GT^CG^TCGTGTCRTTTTTATGGATGCAGGCATTATTGTGAGCAAGGGACCCCTAAGGftAGTAT 

tcaacatttttaagaacaatgaatc 
tgatatttttaaaatgcgcgaaaai 
tatcacctattaagacaaaggggc: 

tggttattotctctStgaaatgggttttgcacgtgaagtagcggatcgtgtcatt^ 


AGTAT 

SEQ ID NO. 1505: SAG0492 FROM THE 090 GBS TYPE Ia STRAIN



TTGGGAAAAATGAGCTTTTARAAGGCMTG 
lAAj 



TGA ™ rTOAGAC^GGGGCT^^^ 





TGTTATGCAAGATTT AGCTAAAT^liati 1 «.x otx*^**^* — — - — 

TTTTTATGGATGCAGGCATTATTGTTGASCAAGGGACCCCTAAGGAAGTA

a*TOTAGCTAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAGCGGATCGTGT 1 1 1 a» 



AAGCTTGATGCTCAGACAAAAGCATACGAGCTACTTGAAAAAGrEGGACTCAAAGAG^ 
GCACGTGAAGTAGCGGATCGTGTCTTTTTATGGATGCGGGAATTATTGTGAGCAAGGGACC

SEQ ID NO. [illegible]: SAG0492 FROM THE H36b GBS TYPE Ib STRAIN

ATG^GGmTA^G^CATTCACT^ 

ttamaacaatgaat^^ 
taaaatgcgcg^ 

•rTAAGMAAAGGGGCTTTCT 

acctactt^ct^atcS 

GAAGTAT 

SEQ ID NO. 1509: SAG0492 FROM THE JM9130013 GBS TYPE VIII STRAIN (REVERSE COMPLEMENT)

GGTTTTAAAAGGC^TTGACTTGGATATTCATCAAGGAGAAGTAGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT^TTAA 

GAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGATTGATATAACAGACAAAAAGAATGA 
ATGCGCGAAAAAATGGGCATGGTTTTTCAACAGT^ 

gacaaaotggctttctaagcttgatgctcag^ 

CAGCTAGCTTATCTGGAGGACAACAACAACGAATTGCTATTGCARGAGGTCTTGCAATGAATCCTGATGTCCT^ 

aStcagctcttgItcctgaaa^^ 

TCATGAAATGGGTTTTGCACGTGAAGTAGCGGATCGTGTCATTTTTATGGATGC^GGAATTATTGTTGAGCAAGGGGCCCCT 
TATTTAGCAAAACAAAAGAAAT 

SEQ ID NO. 1510: SAG0492 FROM THE M732 GBS TYPE III STRAIN
GGTGCTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACATTTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGA

CTTTTGAAGGGATTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGG 

TTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTATTAAGACAAAGGGACTTTCTAAGCTTGATGCTCAGACAAAAGCATA 
CGAGCTACTTGAAAAAGTTGGACTCAAAGAGAAGGCTAATGCTTATCCAGCAAGCTTATCTGG 

SEQ ID NO. 1511: SAG0492 FROM THE COH1 GBS TYPE Ia STRAIN

ATTGACTTGGATATTCATCAAGGAGAAGTGGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACA 
CTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGATTGATATAACAGACAAAAAGAATGAT^ATTTTT 

TGGGCATGGTTTTTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTATTAAGACAAAGGGACT 

TCTAAGCTTGATGCTCAGACAAAA^ 
TGG

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508
SEQ1509 
SEQ1510 
SEQ1511 

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 



TGACTTGG 

TTGGGAAAAATGAGGTTTTAAAAGGCATTGACTTGG 

AAAAATGAGGTTTTAAAAGGCATTGACTTGG 

GAGGTTTTAAAAGGCATTGACTTGG 

AATACAAGGACTTCATAAAAGTTTTGGGAAAAATGAGGTTTTAAAAGGCATTGACTTGG

GACTTGG 

ATGAGGTTTTAAAAGGCATTGACTTGG 

GGTTTTAAAAGGCATTGACTTGG 

ATTGACTTGG 

TATTCATCAAGGAGAAGTGGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 
TATTCATCAAGGAGAAGTAGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 
TATTCATCAAGGAGAAGTAGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 
TATTCATCAAGGAGAAGTGGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 

TGGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 

TATTCATCAAGGAGAAGTAGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 
TATTCATCAAGGAGAAGTGGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 
TATTCATCAAGGAGAAGTAGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 
TATTCATCAAGGAGAAGTAGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 

GGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 

TATTCATCAAGGAGAAGTGGTGGTTATTATTGGCCCTTCTGGCTCTGGTAAGTCAACAT 

TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGAA 
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA 
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA 
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA 
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA 
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA 
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA 
TTTTAAGAACAATGAATCTCTTGGAAGTACCAACAAAGGGAACAGTGACTTTTGAAGGGA

TTGATATAACAGACAAAAAAAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT 
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT 
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT 
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT 
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT 
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT 
TTGATATAACAGACAAAAAGAATGATATTTTTAAAATGCGCGAAAAAATGGGCATGGTTT 

TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA 
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA 
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA 
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA 
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA 
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA 
TTCAACAGTTCAATCTATTTCCCAATATGACTGTACTAGAAAATATTACTTTATCACCTA 
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SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 




TTAAGACAAAGGGACTTTCTAAGCTTGATGCTCAGACAAAAGCATACGAGCTACTTGAAA

TTAAGACAAAGGGGCTTTCTAATCTTGATGCTCAGACAAAAGCATATGAGCTACTTGAAA 

TTAAGACAAAGGGGCTTTCTAATCTTGATGCTCAGACAAAAGCATATGAGCTACTTGAAA 

TTAAGACAAAGGGACTTTCTAAGCTTGATGCTCAGACAAAAGCATACGAGCTACTTGAAA 

TTAAGACAAAGGGACTTTCTAAGCTTGATGCTCAGACAAAAGCATACGAGCTACTTGAAA 

TTAAGACAAAGGGGCTTTCTAAGCTTGATGCTCAGACAAAAGCATATGAGCTACTTGAAA 

TTAAGACAAAGGGACTTTCTAAGCTTGATGCTCAGACAAAAGCATACGAGCTACTTGAAA 

TTAAGACAAAGGGGCTTTCTAAGCTTGATGCTCAGACAAAAGCATATGAGCTACTTGAAA 

TTAAGACAAAGGGGCTTTCTAAGCTTGATGCTCAGACAAAAGCATATGAGCTACTTGAAA 

TTAAGACAAAGGGACTTTCTAAGCTTGATGCTCAGACAAAAGCATACGAGCTACTTGAAA 

TTAAGACAAAGGGACTTTCTAAGCTTGATGCTCAGACAAAAGCATACGAGCTACTTGAAA

AAGTTGGACTCAAAGAGAAGGCTAATGCTTATCCAGCTAGCTTATCTGGAGGACAACAAC 
AAGTTGGACTCAAAGAGAAGGCTAATACTTATCCAGCTAGCTTATCTGGAGGACAACAAC 
AAGTTGGACTCAAAGAGAAGGCTAATACTTATCCAGCTAGCTTATCTGGAGGACAACAAC 
AAGTTGGACTCAAAGAGAAGGCTAATGCTTATCCAGCAAGCTTATCTGGAGGACAACAAC 
AAGTTGGACTCAAAGAGAAGGCTAATGCTTATCCAGCTAGCTTATCTGGAGGGCAACAAC 
AAGTTGGACTCAAAGAGAAGGCTAATACTTATCCAGCTAGCTTATCTGGAGGACAACAAC 
AAGTTGGACTCAAAGAGAAGGCTAATGCTTATCCAGCTAGCTTATCTGGAGGACAACAAC 
AAGTTGGACTCAAAGAGAAGGCTAATACTTATCCAGCTAGCTTATCTGGAGGACAACAAC 
AAGTTGGACTCAAAGAGAAGGCTAATACTTATCCAGCTAGCTTATCTGGAGGACAACAAC 

AAGTTGGACTCAAAGAGAAGGCTAATGCTTATCCAGCAAGCTTATCTGG 

AAGTTGGACTCAAAGAGAAGGCTAATGCTTATCCAGCAAGCTTATCTGG

ACGGATTGCTATTGCAAGAGGTCTTGCAATGAATCCTGATGTCCTTCTTTTTGATGAAC 
ACGAATTGCTATTGCAAGAGGTCTTGCAATGAATCCTCATGTCCTTCTTTTTGATGAAC 
ACGAATTGCTATTGCAAGAGGTCTTGCAATGAATCCTGATGTCCTTCTTTTTGATGAAC 
ACGGATTGCTATTGCAAGAGGTCTTGCAATGAATCCTGATGTCCTTCTTTTTGATGAAC 
ACGAATTGCTATTGCAAGAGGTCTTGCAATGAATCCTGATGTCCTTCTTTTTGATGAAC 
ACGAATTGCTATTGCAAGAGGTCTTGCAATGAATCCTGATGTCCTTCTTTTTGATGAAC 
ACGAATTGCTATTGCAAGAGGTCTTGCAATGAATCCTGATGTCCTTCTTTTTGATGAAC 
ACGAATTGCTATTGCAAGAGGTCTTGCAATGAATCCTGATGTCCTTCTTTTTGATGAAC 
ACGAATTGCTATTGCAAGAGGTCTTGCAATGAATCCTGATGTCCTTCTTTTTGATGAAC 


TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 
TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 
TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 
TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 
TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 
TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 
TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 
TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 
TACTTCAGCTCTTGATCCTGAAATGGTAGGTGAAGTCTTGACTGTTATGCAAGATTTAG 



TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
TAAATCTGGTATGACGATGGTTATTGTCACTCATGAAATGGGTTTTGCACGTGAAGTAG 
Table 15: Comparative Sequences relating to SAG0492






SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508 
SEQ1509 
SEQ1510 
SEQ1511 

SEQ1501 
SEQ1502 
SEQ1503 
SEQ1504 
SEQ1505 
SEQ1506 
SEQ1507 
SEQ1508
SEQ1509 
SEQ1510 
SEQ1511 



GGATCGTGTCATTTTTATGGATGCAGGCATTATTGT-GAGCAAGGGACCCCTAAGGAAG 

GGATCGTGTCATTTTTATGGACGCAGAAATTAT 

GGATCGTGTCATTTTTATGGATGCAGGAATTATTGTTGAGCAAGGGGCCC 

GGATCGTGTCATTTTTATGGATGCAGGGATTATTGTTGAGCAAGGGACCCCTAAGAAAG 
GGATCGTGTCATTTTTATGGATGCAGGCATTATTGTTGASCAAGGGACCCCTAAGGAAG
GGATCGTGTCATTTTTATGGATGCAGGAATTATTGTGAGCAAGGGGCCCCTAAGGAAGT 

GGATCGTGTC-TTTTTATGGATGCGGGAATTATTGT-GAGCAAGGGACC 

GGATCGTGTCATTTTTATGGATGCASGAATTATTGTTGAGCAAGGGGCCCCTAAGGAAG 
GGATCGTGTCATTTTTATGGATGCAGGAATTATTGTTGAGCAAGGGGCCCCTAAGGAAG 



AT- 



AT- 
A— 



TTTGAGCAGACAAAAGAAATCCGCACAAGAGATTTCTT 



AT- 



ATTT AGCAAAACAAAAGAAAT - 
Table 16: Comparative Sequences relating to SAG0767 (D-alanine - D-alanine ligase)

SEQ ID NO. 1601: SAG0767 FROM THE M781 GBS TYPE III STRAIN

TGGTCGCTCTGTCGGAACGTGAAGTATCTGTACTGTCTGCAGAAAGCGTCATGCGTGCTATTAATTATGATAAATTTTTT 
GTTAAAACTTATTTTATCACGCAAGTAGGTCAATTTATTAAARCACAAGAATTTGATGAAATGCCATCTTCAGATGAAAA 
GTTAMGACAAACCAAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATGATGA 
CCGTTTTACATGGACC^TGGGGGAAGATGGTTCTATCCAAG^ 

AATATTCTATCTTCAAGCGTGGCTATGGATAAAATTACAACAAAACAAGTCCTTGCAACAGTAGGTGTACCTCAGGTTGC 
ATATCAAACTTATTTTGAGGGTGATGATTTGGAACATGCGATTAAACTCTCTTTAGAAACTTTAAGTTTCCCAATTTTTG 
TAAAACCGGCTAATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCACTTCGCTCTGCAATTGACTTA 
GCTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCGTGACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAA 
TGATGTTAAGACAACTTTTCCTGGCGAAGTTGTTAAAGACGTCGATTTCTATGACTATGACGCCAAATATATTGATAATA 
AAATTACTATGGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCGTCAATATGCAAGTAAAGCTTTTAAA
GCAATCGGGGCTTGTGGTTTATCACGCTGTGATTTCTTTTTGACGAAAGATGGACAAATCTTCTTAAACGAACTGAATAC 
AATGCCCGGTTTTACTCAGTGGTCAATGTATCCTCTGCTTTGGGAAAATATGGGGCTAACTTATAGTGATTTGATTG 

SEQ ID NO. 1602: SAG0767 FROM THE 090 GBS TYPE Ia STRAIN

AAACCGGGCATTGTATTCAGTTCGTTTAAGAAGACTTGTCCATCTTTCGTCAAAAAGAAATCACAGCGTGATAAACCACA 
AGCCCCGATTGCTTTAAAAGCTTTACTTGCATATTGACGCATTGCTTCCATAGTTGCTTCATCAACTTTAGCTGGAATAT 
CCATAGTAATTTTATTATCAATATATTTGGCGTCATAGTCATAGAAATCGACGTCTTTAACGACTTCGCCAGGAAAAGTT 
GTCTTAAC^TCATTATTGCCTAAAATACCTACTTCAATTTCACGAGCTGTCACGCCTTGTTCAATCAAAATACGGCTATC 
ATACTTGAGAGCTAAGTCAATKSCAGAGCGAAGTGAGGATTCATCTGTCGCTTTTGAAATACCTACTGATGACCC
TAGCCGGTTTTACAAAAATTGGGAAACTTAAAGTTTCTAAAGAGAGTTTAATCGCATGTTCCAAATCATCACCCTCAAAA 
TAAGTTTGATATGCAACCTGAGGTACACCTACTGTTGCAAGGACTTGTTTTGTTGTAATTTTATCCATAGCCACGCTTGA 

AGATAGAATATTAGTCCCAACATAAGGCATCCTTAAAACTTCTAAAAATCCTTC 

CATGTAAAACGGGGAAAACAATTGCATTATCATGATAGATATCACTTGGACGAACCATTTTGTCTAAATCAACAGTTTGG 
TTTGTCATTAACTTTTCATCTGAAGATGGCATTTCATCAAATTCTTGTGTTTTAATAAATTGACCTACTTGCGTG 

SEQ ID NO. 1603: SAG0767 FROM THE COH1 TYPE Ia STRAIN

TCGCTCTGCGGAACGTGAAGTATCTGTACTGTCTGCAGAAAGCGTC^TGCGTGCTATTAATTATGATAAATTTTTTGTTA

AAACTTATTTTATCACGCAAGTAGGTCAATTTATTAAAACACAAGAATTTGATGAAATGCCATCTTCAGATGA^GTTA 

ATGACAAACCAAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATGATGATAATGCAATTGTTTTCCCCGT 

TTTACATGGACCAATGGGGGAAGATGGTTCTATCCAAGGATTTTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTAATA 

TTCTATCTTCAAGCGTGGCTAT 

SEQ ID NO. 1604: SAG0767 FROM THE CJB110 GBS NONTYPEABLE STRAIN (REVERSE COMPLEMENT)
CGTC^ATTTCTATGACTATGACGCCAAATATATTGATAATAAAATTACTATGGATATTCC^GCT 

CTATGGAAGCAATGCGTCAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTTCTTT 
TTGACGAAAGATGGACAAATCTTCTTAAACGAACTGAATACAATGCCC 

SEQ ID NO. 1605: SAG0767 FROM THE CJB110 GBS NONTYPEABLE STRAIN

AACGTGAAGTATCTGTACTGCTCTGCAGAAAAGCGTCATGCGTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATT 
TTATCACGCAAGTAGGTCAATTTATTAAAACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAA 

SEQ ID NO. 1606: SAG0767 FROM THE 1169NT1 GBS TYPE V STRAIN (REVERSE COMPLEMENT)

CTAATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCACTTCGCTCTGCAATTGACTTAGCTCTCAAG 
TATGATAGCCGTATTTTGATTGAACAAGGCGTGACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAA 
GACAACTTTTCCTGGCGAAGTCGTTAAAGACGTCGATTTCTATGACTATGACGCCAAATATATTGATAATAAAATTACTA 
TGGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCGTCAATATGCAAGTAAAGCTTTTAAAGCAATCGGG 
GCTTGTGGTTTATCACGCTGTGATTTCTTTTTGACGAAAGATGGACAAATCTTCTTAAACGAACTGAATACAATGCCCGG 

TTTTACTCAGTGGTCAATGTATCCTCTGCTTTGGGAAAAT 



CTGAATAi 





SEQ ID NO. 1609: SAG0767 FROM THE 2603V/R GBS TYPE V STRAIN (REVERSE COMPLEMENT)

GGCTATGGATAAAATTACAACAAAACAAGTCCTTGCAACAGTAGGTGTACCTCAGGTTGCATATCAAACTTATTTTGAGG 

GTGATGATTTGGAACATGCGATTAAACTCTCTTTAGAAACTTTAAGTTTCCCAATTTTTGTAAAACCGGCTAATATGGGG 

TCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCACTTCGCTCTGCAATTGACTTAGCTCTCAAGTATGATAGCCG 

TATTTTGATTGAACAAGGCGTGACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTC 

CTGGCGAAGTCGTTAAAGACGTCGATTTCTATGACTATGACGCCAAATATATTGATAATAAAATTACTATGGATATTCCA 

GCTAAAGTTGATGAAGCAACTATGGAAGCAATGCGTCAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTT 

ATCACGCTGTGATTTCTTTTTGACGAAAGAATGGACAAATCTTCTTAAACGAACTGAAATAC 

SEQ ID NO. 1610: SAG0767 FROM THE 2603V/R GBS TYPE V STRAIN 

TCTGTACTGTCTGCAGAAAGCGTCATGCGTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGT 
AGGTCAATTTATTAAAACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTAATGACAAACCAAACTGTTGATT 

TAGACAAAATGGTTCGTCCAAGTGATATCTATGATGATAAT 

SEQ ID NO. 1611: SAG0767 FROM THE H36b GBS TYPE Ib STRAIN (REVERSE COMPLEMENT)

AAAACCGGCTAATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCACTTCGCTCTGCAATTGACTTAG 

CTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCGTGACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAAT 

GATGTTAAGACAACTTTTCCTGGCGAAGTCGTTAAAGACGTCGATTTCTATGACTATGACGCCAAATATATTGATAATAA 

AATTACTATGGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCGTCAATATGCAAGTAAAGCTTTTAAAG 

CAATCGGGGCTTGTGGTTTATCACGCTGTGATTTCTTTTTGACGAAAGATGGACAAATCTTCTTAAACGAACTGAATACA 

ATGCCCGGTTTTACTCAGTGGTCAATGTATCCCCTGCTTTGGGAAAATATGGGGCTAACTTATAG 

SEQ ID NO. 1612: SAG0767 FROM THE H36b TYPE Ib STRAIN

CGTGAAGTATCTGTACTGTCTGCAGAAAGCGTCATGCGTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTAT 
CACGCAAGTAGGTCAATTTATTAAAACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTAATGACAAACCAAA 
CTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATGATGATAATGCAATTGTTTTCCCCGTTTTACATGGACCA 
ATGGGGGAAGATGGTTCTATCCAAGGATTTTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTAATATTCTATCTTCAAG 
CGTGGCTATGGATAAAATTACAACAAAACAAGTCCTTGCAACAGTAG 

SEQ ID NO. 1613: SAG0767 FROM THE M732 GBS TYPE III STRAIN (REVERSE COMPLEMENT) 

ATGCGATTAAACTCTCTTTAGAACCTTTAAGTTTCCCAATTTTTGTAAACCCGGCTAATATGGGGTCATCAGTAGGTATT 

TCAAAAGCGACAGATGAATCCTCACTTCGCTCTGCAATTGACTTAGCTCTCAAGTATGATAGCCGTATTTTGATTGAACA 

AGGCGTGACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTCCTGGCGAAGTTGTTA 

AAGACGTCGATTTCTATGACTATGACGCCAAATATATTGATAATAAAATTACTATGGATATTCCAGCTAAAGTTGATGAA 

/-/-n jv r*<ni\ mr?nn icr-a n •mr-f^Tr'a nTaTRP A»f5T AAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTG ATTT 




TGCTTTGGGAAAATATGGGGCTAACTT 
SEQ ID NO. 1614: SAG0767 FROM THE M732 GBS TYPE III STRAIN 

GTCATGCCGTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAGGTCAATTTATTAAAACAC 
AAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTAATGACAAACCAAACTGTTGATTTAGACAAAATGGTTCGTCCA 
AGTGATATCTATGATGATAATGCAATTGTTTTCCCCGTTTTACATGGACCAATGGGGGAAGATGGTTCTATCCAAGGATT 
TTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTAATATTCTATCTTCAAGCGTGGCTATGGATAAAATTACAACAAAAC 

AAGTCCTTGCAACAGTAGGTGTACCTCAGG 

SEQ ID NO. 1615: SAG0767 FROM THE A909 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

TTTTGAGGGTGATGATTTGGAACATGCGATTAAACTCTCTTTAGAAACTTTAAGTTTCCCAATTTTTGTAAAACCGGCTA 

ATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCACTTCGCTCTGCAATTGACTTAGCTCTCAAGTAT 

GATAGCCGTATTTTGATTGAACAAGGCGTGACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGAC 

AACTTTTCCTGGCGAAGTCGTTAAAGACGTCGATTTCTATGACTATGACGCCAAATATATTGATAATAAAATTACTATGG 

ATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCGTCAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCT 

TGTGGTTTATCACGCTGTGATTTCTTTTTGACGAAAGATGGACAAATCTTCTTAAACGAACTGAATACAATGCCCGGTTT 

TACTCAGTGGTCAATGTATCCCCTGCTTTGGGAAAATATGGGGCTAACTTATAGTGA 



TTAAAACTTATTTTATCACGCAAGTAGGTCAATTTATTAAAACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAG 
TTAATGACAAACCAAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATGATGATAATGCAATTGTTTTCCC 
CGTTTTACATGGACCAATGGGGGAAGATGGTTCTATCCAAGGATTTTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTA 
ATATTCTATCTTCAAGCGTGGCTATGGATAAAATTACAACAAAACAAGTCCTTGCAACAGTAGG 



SEQ ID NO. 1617: SAG0767 FROM THE JM9130013 GBS TYPE VIII STRAIN (REVERSE COMPLEMENT)

AAGCAGGGGATACATTGACCACTGAGTAAAACCGGGCATTGTATTCAGTTCGTTTAAGAAGATCTGTCCATCTTTCGTCA
AAAAGAAATCACAGCGTGATAAACCACAAGCCCCGATTGCTTTAAAAGCTTTACTTGCATATTGACGCATTGCTTCCATA 
GATGCTTCATCAACTTTAGCTGGAATATCCATAGCAATTTTATTATCAATATATTTGGCG 

SEQ1601 GGTCGCTCTGTCGGAACGTGAAGTATCTGTACTGTCTGCAGAAAGCGTCATGCGTGCTA 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617 

SEQ1601 TAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAGGTCAATTTATTA 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1601 AACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTAATGACAAACCAAACTG

SEQ1602 

SEQ1603 

SEQ1604

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 




SEQ1601 TGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATGATGATAATGCAATTGTTTTCC

SEQ1602

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617

SEQ1601 CGTTTTACATGGACCAATGGGGGAAGATGGTTCTATCCAAGGATTTTTAGAAGTTTTAA 

SEQ1602

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617 

SEQ1601 GATGCCTTATGTTGGGACTAATATTCTATCTTCAAGCGTGGCTATGGATAAAATTACAA 

SEQ1602

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609 GGCTATGGATAAAATTACAA

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617 



SEQ1601 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1601 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1601 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1601 
SEQ1602 
SEQ1603 
SEQ1604 
SEQ1605
SEQ1606 
SEQ1607 
SEQ1608
SEQ1609 






AAAACAAGTCCTTGCAACAGTAGGTGTACCTCAGGTTGCATATCAAACTTATTTTGAGG 

AAAACAAGTCCTTGCAACAGTAGGTGTACCTCAGGTTGCATATCAAACTTATTTTGAGG

TTTTGAGG 

TGATGATTTGGAACATGCGATTAAACTCTCTTTAGAAACTTTAAGTTTCCCAATTTTTG 

TGATGATTTGGAACATGCGATTAAACTCTCTTTAGAAACTTTAAGTTTCCCAATTTTTG 

ATGCGATTAAACTCTCTTTAGAACCTTTAAGTTTCCCAATTTTTG 

TGATGATTTGGAACATGCGATTAAACTCTCTTTAGAAACTTTAAGTTTCCCAATTTTTG 

AAAACCGGCTAATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCAC 

AAACCGGGC 

TCGCTCTGCGGAACGTGAAGTATCTGTACTG-TCTGCAGAAA-GCGT

AACGTGAAGTATCTGTACTGCTCTGCAGAAAAGCGT 

CTAATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCAC 

ATCTGTACTG-TCTGCAGAAAAGCGT 

AAAACCGGCTAATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCAC 

TCTGTACTG-TCTGCAGAAA-GCGT 

AAAACCGGCTAATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCAC 

CGTGAAGTATCTGTACTG-TCTGCAGAAA-GCGT 

AAACCCGGCT^TATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCAC 

GT 

AAAACCGGCTAATATGGGGTCATCAGTAGGTATTTCAAAAGCGACAGATGAATCCTCAC 

TGGTCGCTCTGCGGAACGTGAAGTATCTGTACTG-TCTGCAGAAA-GCGT 

AAGCAGGGGATACATTGACCACTGAGTAAAACCGGGC 

TCGCTCTGCAATTGACTTAGCTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCG 
TTGT-ATTCAGTTCGTTTAAGAAGACTTGTCCATCTTTCGTCAAAAAGAAATCACAGCG 
ATGC-GTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAG 

ATGC-GTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAG 
TCGCTCTGCAATTGACTTAGCTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCG 

TTGACTTAGCTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCG 

ATGC-GTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAG 
TCGCTCTGCAATTGACTTAGCTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCG 



SEQ1610 
SEQ1611 
SEQ1612 
SEQ1613 
SEQ1614 
SEQ1615 
SEQ1616 
SEQ1617 



SEQ1601 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1601 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1601 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 






ATGC-GTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAG 
TCGCTCTGCAATTGACTTAGCTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCG 
ATGC-GTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAG 
TCGCTCTGCAATTGACTTAGCTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCG 
ATGCCGTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAG 
TCGCTCTGCAATTGACTTAGCTCTCAAGTATGATAGCCGTATTTTGATTGAACAAGGCG 
ATGC-GTGCTATTAATTATGATAAATTTTTTGTTAAAACTTATTTTATCACGCAAGTAG 
TTGT-ATTCAGTTCGTTTAAGAAGATCTGTCCATCTTTCGTCAAAAAGAAATCACAGCG 

GACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTC

GATAAACCACAAGC CCCGATTGCTTTAAAAGCTTTACTTGCATATTGACGCATTG 

GTCAATTTATTAAA ACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTA 

GTCAATTTATTAAA ACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAA

GACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTC 
GACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTC 

GTCAATTTATTAAA ACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTA 

GACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTC 

GTCAATTTATTAAA ACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTA

GACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTC 

GTCAATTTATTAAA ACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTA 

GACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTC 

GTCAATTTATTAAA ACACAAGAATTTGATGAAATGCCATCTTCAGATGAAAAGTTA 

GACAGCTCGTGAAATTGAAGTAGGTATTTTAGGCAATAATGATGTTAAGACAACTTTTC

GTCAATTTATTAAA ACACAAGAATTTGATGAAATGCCATCTTCAGATGAAA^GTTA 

GATAAACCACAAGC CCCGATTGCTTTAAAAGCTTTACTTGCATATTGACGCATTG 

TGGCGAAGTTGTTAAAGACGTCGATTTCTATGA--CTATGACGCCAAAT-ATATTGATA

TTCCATAGTT GCTTCATCAACTTTAGCTGGAATATCCATAGTAATTTTATTATCA

TGACAAACC AAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATG

CGTCGATTTCTATGA--CTATGACGCCAAAT-ATATTGATA

TGGCGAAGTCGTTAAAGACGTCGATTTCTATGA--CTATGACGCCAAAT-ATATTGATA
TGGCGAAGTCGTTAAAGACGTCGATTTCTATGA--CTATGACGCCAAAT-ATATTGATA

TGACAAACC AAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATG

TGGCGAAGTCGTTAAAGACGTCGATTTCTATGA--CTATGACGCCAAAT-ATATTGATA

TGACAAACC AAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATG

TGGCGAAGTCGTTAAAGACGTCGATTTCTATGA--CTATGACGCCAAAT-ATATTGATA

TGACAAACC AAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATG

TGGCGAAGTTGTTAAAGACGTCGATTTCTATGA--CTATGACGCCAAAT-ATATTGATA

TGACAAACC AAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATG

TGGCGAAGTCGTTAAAGACGTCGATTTCTATGA--CTATGACGCCAAAT-ATATTGATA

TGACAAACC AAACTGTTGATTTAGACAAAATGGTTCGTCCAAGTGATATCTATG

TTCCATAGAT GCTTCATCAACTTTAGCTGGAATATCCATAGCAATTTTATTATCA

TAAAATTACTAT--GGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCG
TATATTTGGCGTCATAGTCATAGAAATCGACGTCTTTAACGACTTCGCCAGG--AAAAG

TGATAATGCAAT--TGTTTTCCCCGTTTTAC ATGGACCAATGGGGGAAG--ATGGT

TAAAATTACTAT--GGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCG

TAAAATTACTAT--GGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCG
TAAAATTACTAT--GGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCG

TGATAATGCAAT--TGTTTTCCCCGTTTTAC ATGGACCAATGGGGGAAG--ATGGT

TAAAATTACTAT--GGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCG

TGATAAT

TAAAATTACTAT--GGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCG

TGATAATGCAAT--TGTTTTCCCCGTTTTAC ATGGACCAATGGGGGAAG--ATGGT

TAAAATTACTAT--GGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCG

TGATAATGCAAT--TGTTTTCCCCGTTTTAC ATGGACCAATGGGGGAAG--ATGGT

TAAAATTACTAT--GGATATTCCAGCTAAAGTTGATGAAGCAACTATGGAAGCAATGCG

TGATAATGCAAT--TGTTTTCCCCGTTTTAC ATGGACCAATGGGGGAAG--ATGGT

TATATTTGGCGT



SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1601 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1601 

SEQ1602 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 



CAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTT
T--GTCTTAAGATCATTATTGCCTAAAATACCTACTTCAATTTCACGAGCTGTCACGCC

C--TATCCAAG--GATTTTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTAATATTCT

CAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTT

CAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTT
CAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTT
C--TATCCAAG--GATTTTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTAATATTCT
CAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTT

CAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTT
C--TATCCAAG--GATTTTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTAATATTCT
CAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTT
C--TATCCAAG--GATTTTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTAATATTCT
CAATATGCAAGTAAAGCTTTTAAAGCAATCGGGGCTTGTGGTTTATCACGCTGTGATTT
C--TATCCAAG--GATTTTTAGAAGTTTTAAGGATGCCTTATGTTGGGACTAATATTCT

TTTTTGACGAAAGA--TGGACAAATCTTCTTAAACGAACTGAA-TACAATGCCCGGTTT
TGTTCAATCAAAATACGGCTATCATACTTGAGAGCTAAGTCAATKSCAGAGCGAAGTGA

TCTTCAAGCGTGGCTAT

TTTTTGACGAAAGA--TGGACAAATCTTCTTAAACGAACTGAA-TACAATGCCC

TTTTTGACGAAAGA--TGGACAAATCTTCTTAAACGAACTGAA-TACAATGCCCGGTTT
TTTTTGACGAAAGA--TGGACAAATCTTCTTAAACGAACTGAA-TACAATGCCCGGTTT

TCTTCAA — — — — — — — — 

TTTTTGACGAAAGAA-TGGACAAATCTTCTTAAACGAACTGAAATAC

TTTTTGACGAAAGA--TGGACAAATCTTCTTAAACGAACTGAA-TACAATGCCCGGTTT

TCTTCAAGCGTGGCTATGGATAAAATTACAACAAAACAAGTCCTTGCAACAGTAG

TTTTTGACGAAAGA--TGGACAAATCTTCTTAAACGAACTGAA-TACAATGCCCGGTTT
TCTTCAAGCGTGGCTATGGATAAAATTACAACAAAACAAGTCCTTGCAACAGTAGGTGT
TTTTTGACGAAAGA--TGGACAAATCTTCTTAAACGAACTGAA-TACAATGCCCGGTTT
TCTTCAAGCGTGGCTATGGATAAAATTACAACAAAACAAGTCCTTGCAACAGTAGG

ACTCAGTGGTCAATGTATCCTCTGCTTTGGGAAAA-TATGGGGCTAACTTATAGTGATT 
GATTCATCTGTCGCTTTTGAAATACCTACTGATGACCCCATATTAGCCGGTTTTACAAA

ACTCAGTGGTCAATGTATCCTCTGCTTTGGGAAAA-T 

ACTCAGTGGTCAATGTATCCCCTGCTTTGGGAAAAGTATGGGGCTAACCTT 

ACTCAGTGGTCAATGTATCCCCTGCTTTGGGAAAA-TATGGGGCTAACTTATAG 

ACTCAGTGGTCAATGTATCCTCTGCTTTGGGAAAA-TATGGGGCTAACTT 

CCTCAGG 

ACTCAGTGGTCAATGTATCCCCTGCTTTGGGAAAA-TATGGGGCTAACTTATAGTGA 

SEQ1601 GATTG

SEQ1602 ATTGGGAAACTTAAAGTTTCTAAAGAGAGTTTAATCGCATGTTCCAAATCATCACCCTC 

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617 

SEQ1602 AAATAAGTTTGATATGCAACCTGAGGTACACCTACTGTTGCAAGGACTTGTTTTGTTGT 

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617 

SEQ1601 

SEQ1602 ATTTTATCCATAGCCACGCTTGAAGATAGAATATTAGTCCCAACATAAGGCATCCTTAA 

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

SEQ1602 ACTTCTAAAAATCCTTGGATAGAACCATCTTCCCCCATTGGTCCATGTAAAACG 

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617 

SEQ1602 ACAATTGCATTATCATCATAGATATCACTTGGACGAACCATTTTGTCTAAATCAACAGT

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617

SEQ1602 tgGTTTGT^TT^^ 

SEQ1603

SEQ1604

SEQ1605

SEQ1606

SEQ1607

SEQ1608

SEQ1609

SEQ1610

SEQ1611

SEQ1612

SEQ1613

SEQ1614

SEQ1615

SEQ1616

SEQ1617

SEQ1601 

SEQ1602 AATTGACCTACTTGCGTG 

SEQ1603 

SEQ1604 

SEQ1605 

SEQ1606 

SEQ1607 

SEQ1608 

SEQ1609 

SEQ1610 

SEQ1611 

SEQ1612 

SEQ1613 

SEQ1614 

SEQ1615 

SEQ1616 

SEQ1617 

Table 17: Comparative Sequences relating to SAG1086 (xanthine phosphoribosyltransferase)



SEQ ID NO. 1701: SAG1086 FROM THE 1169NT1 GBS NONTYPEABLE STRAIN

TTTAAAGGTTGATTCCTTTTTGACTCATC^GGTAGATTTTGAGTTAATG^ 

CCGGCATTACGAAGGTTGTTACGATTGAAGCATCTGGAATTGCGCCAGCAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATAMT 

GCTAAAAAGGCTAAGAACATTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTAC^ 
TATTGT^GTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATGACTTTTTAGCAAACGGTCAAGCGGCT 
^rmmnr.mrPT7\>iiprnn nmr«r:^^ nrnrpoTi nn anTnTT^rr nanaTf^RCGTGATTTGTTAGAAAAA 



TATTGTGAGT CGCT TTTTAT CT AACGAT GAT ACT GT ACT CAT CAT i tj A i G/\^ a i i iirtbLftHHU^ 1 w^v^ov^ * 222 *™* *~ 
AAATTATTGGTCAAGCTGGAGCTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGG 

ACAGGTGTTCCAGT 

SEQ ID NO. 1702: SAG1086 FROM THE 18RS21 GBS TYPE II STRAIN

TT^GGTGAGAACATTTTAAAGGTTGATTCTTTTTTGACTCATCAGGTAGATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTG 
ATAAATATAAAGAAGCCGGCATTACGAAGGTTGTTACGATTGAAGCATCTGGAATTGCACCAGCAGTGTACGCAGCTCAAGCATTGGGC 

GkACCAATGATATTTGCTAAAAAAGCTAA^^ 

TACGAGTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATGACTTTTTAGCAAACGGTCAAGCGG 

CTAAAGGATTACTTGAAATTATTGGTCAAGCTGGAGCTAAGGTTGCTGGTATCGGAATCGTTATTGA^ 

GATTTGTTAGAAAAAACA 

SEQ ID NO. 1703: SAG1086 FROM THE H36b GBS TYPE Ib STRAIN

AAGAACGTATTCTTAAAGATGGTGATGTTTTAGGTGAGAAGATTTTAAAAGTTGATTCTTTTTTGACTCATCAGGTAGATTTTGAGTTA 

ATGCAGGAAATAGGTAAAGTTTTTGCTGATAAATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCC 

AGCAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACATTACTATGACTGAAGG 

CTGAAGTGTATTCTTTTACAAAGCAAGTTACGAGTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATT 

GATGACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATTATTGGTCAAGCTGGAGCTAAGGTTGCTGGTATCGGAATCYT 

TATTGAAAAATCTTTCCAAGATGGGCGTGATT 

SEQ ID NO. 1704: SAG1086 FROM THE M732 GBS TYPE III STRAIN

ATTCTTTTTTGACTATC^GGTAAATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATAAATATAAAGAAGCCGGCATTACGA 
AGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAGCAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATOT 

AAGAACATTACTATGACTGAAGGTATCTT^ 
CTTTTTATCTAACGATGATACTGTACTCATCATTG^^ 

AAGCTGAAGCTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTGTTAGAAAA 
GTTACTTCTCTTGCTCGT 




ACATTTTAAAGGTTGATTCTTTTTTGACTCATCAGGTAGATTTTGAGTTAATGCAGLrAAATA^XMftAtoi in x^w w*x«™*r> A ~™~ 

GAAGCCGGCATTACGAAGGTTGTTACGATTGAAGCATCTGGAATTGCACCAGCAGT^ 

ATTTGCTAAAAAAGCTAAGAACATTACTATGACTGAAGGTATCTTAACTGCTGAA 

TTTCTAOTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATGACTTTTTAGCAAACMGTCT 

CTTGAAATTATTGGTCAAGCTGGAGCTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTGTTAGA 
AAA 

SEQ ID NO. 1707: SAG1086 FROM THE A909 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

ACGTATTCTTAAAGATGGTGATGTTTTAGGTGAGAA 

AGGAAATAGGTAAAGTTTTTGCTGATAAATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG^ 

GTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAG 

AGTGTATTCTTTTACAAAGCAAGTTACGAGTCAAGTTTCTATTGTGAGTCGCTTTTTATCT 

ACTTTTTAGCAAACGGKCAAGCGGSTAAAGGATTACTTGAAATTATTGGTCAAGCTGGAGCTA 

SEQ ID NO. 1708: SAG1086 FROM THE COH1 GBS TYPE Ia STRAIN

TTTAAAAGTTGATTCTTTTTTGACTCATCAGGTAAATTTTGAGTTAATGCAGGA 

CCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAGCAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGAT 
GCTAAAAAAGCTAAGAACACTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGT 

TATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATGACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTT 

AAATTATTGGTCAAGCTGAAGCTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTC 

ACAGGTGTTCCGGTTAC 

SEQ ID NO. 1709: SAG1086 FROM THE CJB110 GBS NONTYPEABLE STRAIN (REVERSE COMPLEMENT)

GCTGRTAARTATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAGCRGTGTACGCMCT^G^TT 
GGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACATTACT^^ 

ARGTTACGAGTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATGACTTTTTAGCA^CGGTCAA 
GCGGCTAAAGGATTACTTGAAATTTATTGGTCAAGCTGGAGCTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTC<!y^AGATG 

GGCGTGATTTGTTAGAAAAAACAGGTGTTCCAGT 

SEQ ID NO. 1710: SAG1086 FROM THE 2603 V/R GBS TYPE V STRAIN

AACGTATTCTTAAAGATGGTGATGTTTTAGGTGAGAACATTTTAAAAGTTGATTCTTTTTTGACTCATCAGGTAGATTTTGAGTTAATG

CAGGAAATAGGTAAAGTTTTTGCTGATAAATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAGC 

AGTGTACGC^GCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACATTACTATGACTGAAGGTATCTT^ 

AAGTGTATTCTTTTACAAAGCAAGTTACGAGTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGAT 

GACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATTATTGGTCAAGCTGGAGCTAAGGTTGCTGGTATCGGAATCGTTAT 

TGAAAAATCTTTCCAAGATGGGCGTGATTTGTTAGAAAAAACAGGTGTTCCAG 

SEQ ID NO. 1711: SAG1086 FROM THE JM9130013 GBS TYPE VIII STRAIN (REVERSE COMPLEMENT)

ACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAGCAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAA 
AGCTAAGAACATTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGAGTCAAGTTTCTATTGTGA 
GTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATGACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATTATT 

GGTCAAGCTGGAGCTAAGGTTGCTGGTATCGGA 

SEQ1701 TTTAAAGGTTGATTCCT

SEQ1702 TTTAGGTGAGAACATTTTAAAGGTTGATTCTT 

SEQ1703 AGAACGTATTCTTAAAGATGGTGATGTTTTAGGTGAGAACATTTTAAAAGTTGATTCTT 

SEQ1704 ATTCT

SEQ1705 -GAACGTATTCTTAAAGATGGTGATGTTTTAGGTGAGAACATTTTAAAAGTTGATTCTT 

SEQ1706 ACATTTTAAAGGTTGATTCTT 

SEQ1707 ACGTATTCTTAAAGATGGTGATGTTTTAGGTGAGAACATTTTAAAAGTTGATTCTT 

SEQ1708 TTTAAAAGTTGATTCTT 

SEQ1709

SEQ1710 --AACGTATTCTTAAAGATGGTGATGTTTTAGGTGAGAACATTTTAAAAGTTGATTCTT

SEQ1711 

SEQ1701 TTTGACTCATCAGGTAGATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA 

SEQ1702 TTTGACTCATCAGGTAGATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA 

SEQ1703 TTTGACTCATCAGGTAGATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA

SEQ1704 TTTTGACTATCAGGTAAATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA 

SEQ1705 TTTGACTCATCAGGTAAATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA 

SEQ1706 TTTGACTCATCAGGTAGATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA

SEQ1707 TTTGACTCATCAGGTAGATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA

SEQ1708 TTTGACTCATCAGGTAAATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA

SEQ1709 GCTGATA 

SEQ1710 TTTGACTCATCAGGTAGATTTTGAGTTAATGCAGGAAATAGGTAAAGTTTTTGCTGATA 

SEQ1711 

SEQ1701 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACGATTGAAGCATCTGGAATTGCGCCAG

SEQ1702 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACGATTGAAGCATCTGGAATTGCACCAG

SEQ1703 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG 

SEQ1704 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG 

SEQ1705 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG 

SEQ1706 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACGATTGAAGCATCTGGAATTGCACCAG 

SEQ1707 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG 

SEQ1708 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG 

SEQ1709 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG

SEQ1710 ATATAAAGAAGCCGGCATTACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG

SEQ1711 ACGAAGGTTGTTACAATTGAAGCATCTGGAATTGCGCCAG

SEQ1701 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAGGCTAAGAACA 

SEQ1702 CAGTGTACGCAGCTCAAGCATTGGGCGKACCAATGATATTTGCTAAAAAAGCTAAGAACA 

SEQ1703 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA 

SEQ1704 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA 

SEQ1705 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA

SEQ1706 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA 

SEQ1707 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA 

SEQ1708 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA 

SEQ1709 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA

SEQ1710 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA 

SEQ1711 CAGTGTACGCAGCTCAAGCATTGGGCGTACCAATGATATTTGCTAAAAAAGCTAAGAACA 






SEQ1701 
SEQ1702 
SEQ1703 
SEQ1704 
SEQ1705 
SEQ1706 
SEQ1707 
SEQ1708 
SEQ1709 
SEQ1710 
SEQ1711 

SEQ1701 
SEQ1702 
SEQ1703 
SEQ1704 
SEQ1705 
SEQ1706 
SEQ1707 
SEQ1708 
SEQ1709 
SEQ1710 
SEQ1711 

SEQ1701 
SEQ1702 
SEQ1703 
SEQ1704 
SEQ1705 
SEQ1706 
SEQ1707 
SEQ1708 
SEQ1709 
SEQ1710 
SEQ1711 

SEQ1701 
SEQ1702 
SEQ1703 
SEQ1704 
SEQ1705 
SEQ1706 
SEQ1707 
SEQ1708 
SEQ1709 
SEQ1710 
SEQ1711 

SEQ1701 
SEQ1702 
SEQ1703 
SEQ1704 
SEQ1705 
SEQ1706 
SEQ1707 
SEQ1708
SEQ1709 
SEQ1710 
SEQ1711 



TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGWTACGA

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATRTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA 

TTACTATGACTGAAGGTATCTTAACTGCTGAAGTGTATTCTTTTACAAAGCAAGTTACGA

GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 
GTCAAGTTTCTATTGTGAGTCGCTTTTTATCTAACGATGATACTGTACTCATCATTGATG 

ACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGGA 
ACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGGA 
ACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGGA 
ACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGAA 

ACTTTTTAACAAACGGTCAAGC 

ACTTTTTAGCAAACMGTCYAGCGGCTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGGA 
ACTTTTTAGCAAACGGKCAAGCGGSTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGGA 
ACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGAA 
ACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATTTATTGGTCAAGCTGGA 
ACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGGA 
ACTTTTTAGCAAACGGTCAAGCGGCTAAAGGATTACTTGAAATT-ATTGGTCAAGCTGGA 

CTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTG 
CTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTG 
CTAAGGTTGCTGGTATCGGAATCYTTATTGAAAAATCTTTCCAAGATGGGCGTGATT — 
CTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTG 

CTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTG 

CTA- — 

CTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTG 

CTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTG 

CTAAGGTTGCTGGTATCGGAATCGTTATTGAAAAATCTTTCCAAGATGGGCGTGATTTG 

CTAAGGTTGCTGGTATCGGA 

TAGAAAAAACAGGTGTTCCAGT 

TAGAAAAAACA 

TAGAAAAAACAGGTGTTCCGGTTACTTCTCTTGCTCGT 
TAGAAAA 

TAGAAAAAACAGGTGTTCCGGTTAC 

TAGAAAAAACAGGTGTTCCAGT 

TAGAAAAAACAGGTGTTCCAG 

Table 18: Comparative Sequences relating to SAG1600 (glutamate racemase) 




SEQ ID NO. 1802: SAG1600 FROM THE M732 GBS TYPE III STRAIN (REVERSE COMPLEMENT) 

SStgtt^tca^tccagaagaggaagt^ 

™MTTTACCTGG^^ 

S^SStaaa^^ 

CTATCCCTOGCTTGTCCGAAATTTGTTCCAATTGTGGAAT 

AATAAACACGGTGGTCATCACTTTTACACAACCGCCAGCCCAAAAGGTTTTAAAGAAA 

SEQ ID NO. 1803: SAG1600 FROM THE 090 GBS TYPE Ia STRAIN 

ARTCTTCATTGGAGACCAGGCTAGAGCTCCGTATG 

CTGTOTTAG^CGTTAOTTTACCAGGAGCTAGCGCAGCTATCA 

gSaaatSgS^^^ 
tSg^tcaamcagatgt^cttcta^tta^ 

otttaggttgcacgcattatccctta^ 

cgscagcccaaaaggtttttaaggaaattgcagaacaatggcttaatcaagaaataaat 

SEQ ID NO. 1804: SAG1600 FROM THE A909 GBS TYPE Ia STRAIN 

gcggttgtgtaaaagtgatgaccac 

ggtttctgcgccaI^ 

aaattaaacta^^ 

mw[ttggaaca1Stttcggacaagcaagggataccaca 

A^ScAT^GAGTACCTATA^ACCAACTTTCCCTAAATTA^ 

mG^GATCTCTAGTTTTTCTTTAAM 

MTW^GAAGiroAMCATC 

GATTACTTCCTCTTCTGGAAGTTGACGGAACATTTCCTTAACAACCGTTAAACCACCT 

SEQ ID NO. 1805: SAG1600 FROM THE COH1 GBS TYPE Ia STRAIN 

TTCCGTCAACTTCCAAAATATGAAGTAATCTTCATTGGAGATCAGGCTAGAGCTCCGTATGGTCCT 

SaSaaagS^ 

cttSIttataggt^tcccatgactgttaaatcagatgctt^^ 
ccttgcttgtccgaaat 

SEQ ID NO. 1806: SAG1600 FROM THE CJB110 GBS NONTYPEABLE STRAIN 

gtamcttcattggac^tcaggct^^ 

TAC 

SEQ ID NO. 1807: SAG1600 FROM THE 1169NT1 GBS TYPE V STRAIN 

CTTTTGGGCTGGCGGTTGTGTAAAATTGATGACCACCGTGTTTA 

CTTTTGG^^C^falttoi^i«A^i^«i"« CGTAATAGGGGATAATG 



CGTGCAACCTAAAATTAAAGTATCTAATTTACCAACTAATGGGGACAATGTTTCATAAACCACCTTT^ 
rzn ncnnmri a^VTA ar aCTP ATfSRRAGTACCTATAA 



GCATCTGATTTAACAGTCATGGGAGTACCTATAA 
SEQ ID NO. 1808: SAG1600 FROM THE 1169NT1 GBS TYPE V STRAIN 

gtLtcttc^ttggg^tcaggct^^ 

cttattgactaaaaatgttaagatgattgttatagcttgtaatacagcaactgcagtt 



SEQ ID NO. 1809: SAG1600 FROM THE 18RS21 GBS TYPE II STRAIN 

GAAATGTTCCGTCAACTTCCAGAAGAGGAAGTAATCTTCATTGGA 

gggaaagttggtattataggtactcccatgactc^ 



SEQ ID NO. 1810: SAG1600 FROM THE 18RS21 TYPE II STRAIN 

ATTTCTTTAAAACCTTTTGGGCTGGCGGTTGTGTAATATTGATGACCACCGTGTTTATT^ 

CAATA/^AACAGAAATATC^CGAACGGTTTCTGCGCCACTATCAATTAATTTAACCTCAGCCCCCATAACATTO 

ATATGGGATAATGCGTGCAACCTAAAATTAAAGTA 

SEQ ID NO. 1811: SAG1600 FROM THE 2603 V/R GBS TYPE V STRAIN 

ATTTCTTTAAAACCTTTTGGGCTGGCGGTTGTGTAATAAGTGATGACCACCGTGTTTATTTTGCCAATTATGGTTTATCTCAAAATAGT 
TCAATA/VAACAGAAATATC^CGAACGGTTTCTGCGCC^CTATCAATTAATTTAACCTCAGCCCCGATAACATTTTO 
AATAGGGGATAATGCGTGCAACCTAAAATTAAAGTATCTAATTTACCAACT 
ACTAGAAGACATCTGATTTGATTCCACAATTGGAACAA 

SEQ ID NO. 1812: SAG1600 FROM THE M781 GBS TYPE III STRAIN 

GGCGGTTGTGTAAAAGTGATGACCACCGTGTTTATTTTGCCAATTATGGTTTATCT 

CGGTTTCTGCGCCACTATCAATTAATTTAACCTCAGCCCCCATAACATTTTGAATGATGGGACGTAATAGGGGATAATGCGTGCAACCT 
AAAATTAAAGTATCTAATTTACCAACTAATGGGGACAACGTTTCATAAACCACCTTTTTGGCTAAACTAGAAGA 

SEQ ID NO. 1813: SAG1600 FROM THE M781 GBS TYPE III STRAIN 

AATCTTCATTGGAGATCAGGCTAGAGCTCCGTATGGTCCTAGACCTGCTCAACAGATTAGAGAGTTTACCTGGCAGATGGTTAACTTCT 
TATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGC 

SEQ ID NO. 1814: SAG1600 FROM THE JM9130013 GBS TYPE VIII STRAIN 

TGGGCTGGTOGTTGTGTAAAAGTGATGACCACCGTGTTTATTOT 
CACGAACGGTTTCTGCGCCACTATCAATTAATTTAACCTCAGCCCCCATAACA^ 

CAACCTAAAATTAAAGTATCTAATTTACCAACTAATGGGGACAACGTTTCATAAACCACCTTTTTGGCTAAACTAGAAGAC^ 

TGATTCCACAATTGGAACAAATTTCGGACAAGCAAGGGATACCACAGCAGTATTTGGM 

CTGATTTAACAGTCATGGGAGTACCTATAATACCAACTTTCCCTGAA 



SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 

SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 



AATCTTCATTGGAGATCAGGCTAGAGCT 

AAATGTTCCGTCAACTTCCAGAAGAGGAAGTAATCTTCATTGGAGATCAGGCTAGAGCT 

AATCTTCATTGGAGACCAGGCTAGAGCT 

GCGGTTGTGTAAAAG-T 

TTCCGTCAACTTCCAAAATATGAAGTAATCTTCATTGGAGATCAGGCTAGAGCT 

- GTAATCTTCATTGGAGATCAGGCTAGAGCT 

CTTTTGGGCTGGCGGTTGTGTAAAAT-T 

GTAATCTTCATTGGGGATCAGGCTAGAGCT 

AAATGTTCCGTCAACTTCCAGAAGAGGAAGTAATCTTCATTGGAGATCAGGCTAGAGCT 

ATTTCTTTAAAACCTTTTGGGCTGGCGGTTGTGTAATAT-T 

ATTTCTTTAAAACCTTTTGGGCTGGCGGTTGTGTAATAAGT 

GGCGGTTGTGTAAAAG-T 

AATCTTCATTGGAGATCAGGCTAGAGCT 

TGGGCTGGCGGTTGTGTAAAAG-T 

CGTATGGTC-CTAGACCTGCTCAACAGATTAGAGAGTTTACCTGGCAGATGGTTAATTT 
CGTATGGTC-CTAGACCTGCTCAACAGATTAGAGAGTTTACCTGGCAGATGGTTAACTT 
CGTATGGTC-CTAGACCTGCTCAACAGATTAGAGAGTT-ACCTGGCAGATGGTTAATTT 

— GATGACCACCGTGTTTATTTTGCCAATTATGG — TTTATCTCA-AAATAGTTCA 

CGTATGGTC-CTAGACCTGCTCAACAGATTAGAGAGTTTACCTGGCAGATGGTTAACTT 
CGTATGGTC-CTAGACCTGCTCAACAGATTAGAGAGTTTACCTGGCAGATGGTTAATTT 

— GATGACCACCGTGTTTATTTTGCCAATTATGG — TTTATCTCA-AAATAGTTCA 

CGTATGGTC-CTAGACCTGCTCAACAGATTAGAGAGTTTACCTGGCAGATGGTTAATTT 
CGTATGGTC-CTAGACCTGCTCAACAGATTAGAGAGTTTACCTGGCAGATGGTTAACTT 

— GATGACCACCGTGTTTATTTTGCCAATTATGG — TTTATCTCA-AAATAGTTCA 

— GATGACCACCGTGTTTATTTTGCCAATTATGG — TTTATCTCA-AAATAGTTCA 

— GATGACCACCGTGTTTATTTTGCCAATTATGG — TTTATCTCA-AAATAGTTCA 

CGTATGGTC-CTAGACCTGCTCAACAGATTAGAGAGTTTACCTGGCAGATGGTTAACTT 
— GATGACCACCGTGTTTATTTTGCCAATTATGG — TTTATCTCA-AAATAGTTCA 



SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 

SEQ1801 

SEQ1802 

SEQ1803 

SEQ1804 

SEQ1805 

SEQ1806 

SEQ1807 

SEQ1808 

SEQ1809 

SEQ1810 

SEQ1811 

SEQ1812 

SEQ1813 

SEQ1814 

SEQ1801 

SEQ1802 

SEQ1803 

SEQ1804 

SEQ1805 

SEQ1806 

SEQ1807 

SEQ1808 

SEQ1809 

SEQ1810 

SEQ1811 

SEQ1812 

SEQ1813 

SEQ1814 

SEQ1801 

SEQ1802 

SEQ1803 

SEQ1804 

SEQ1805 

SEQ1806 

SEQ1807 

SEQ1808 

SEQ1809 

SEQ1810 

SEQ1811 

SEQ1812 

SEQ1813 

SEQ1814 



TTATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGCAGTTGC 
TTATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGCAGTTGC 
TTATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGCAGTTGC 
--ATAAAACAGAAATATCACGAACGGT-TTCTGCGCCACTATCAATTAATTTAACCTCA 
TTATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGCAGTTGC 
TTATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGCAGTTGC 

ATAAAACAGAAATATCACGAACGGT-TTCTGCGCCACTATCAATTAATTTAACCTCA 

TTATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGCAGTT— 
TTATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGCAGTTGC 

ATA? ^(^GAAATATCACGAACGGT-TTCTGCGCCACTATC^TTAATTT7^CCTCA 

—ATT^AAACAGAAATATCACGAACGGT-TTCTGCGCCACTATCAATTAATTTAACCTCA 

ATAAAACAGAAATATCACGAACGGT-TTCTGCGCCACTATCAATTAATTTAACCTCA 

TTATTGACTAAAAATGTTAAGATGATTGTTATAGCTTGTAATACAGCAACTGC 

--ATAAAACAGAAATATCACGAACGGT-TTCTGCGCCACTATCAATTAATTTAACCTCA 

TGGCAAGAAATTAAAGAAAAACTAGACGTGCCTGTTTTAGGCGTTATTTTACCAGGAGC 
TGGCAAGAAATTAAAGAAAAACTAGACATCCCTGTTTTAGGCGTTATTTTACCAGGAGC 
TGGCAAGAAATTAAAGAAAAACTAGACATACCTGTTTTAGGCGTTATTTTACCAGGAGC 
CCCCCATAACATTTTGAATGATGGGACGTAATAGGGGATAATGC-GTGCAACCTAAAAT 
TGGCAAGAAATTAAAGAAAAACTAGACATCCCTGTTTTAGGCGTTATTTTACCAGGAGC 

TGGCAAGAAATTAAAGAAAAACTAGACATAC 

CCCCCATAACATTTTGAATAATGGGACGTAATAGGGGATAATGC-GTGCAACCTAAAAT 

TGGCAAGAAATTAAAGAAAAACTAGACATCCCTGTTTTAG 

CCCCCATAACATTTTGAATGATGGGACGTAATATGGGATAATGC-GTGCAACCTAAAAT 
CCCCCATAACATTTTGAATGATGGGACGTAATAGGGGATAATGC-GTGCAACCTAAAAT 
CCCCCATT^ACATTTTGAATGATGGGACGTl^TAGGGGATAATGC-GTGCAACCTAAAAT 

CCCCCATAACATTTTGAATGATGGGACGTAATAAGGGATAATGC-GTGCAACCTAAAAT 

AGCGCAGCTATCAAATCAACTAATTCAGGGAAAGTTGGTATTATAGGTACTCCCATGAC 
AGCGCAGCTATCAAATCAACTAATTTAGGGAAAGTTGGTATTATAGGTACTCCCATGAC 
AGCGCAGCTATCAAATCAACTAATTCAGGGAAAGTTGGTATTATAGGTACTCCCATGAC 
AAAGTATCTAATTTACCAACTAATGGGGACAACGTTTCATAAACCACCTTTTTGGCTAA 
AGCGCAGCTATCAAATCAACTAATTTAGGGAAAGTTGGTATTATAGGTACTCCCATGAC 

AAAGTATCTAATTTACCAACTAATGGGGACAATGTTTCATAAACCACCTTTTTGGCTAA 

AGCGCAGCTATCAAATCAACTAATTTAGGGAAAGTTGGTATTATAGGTACTCCCATGAC 

AAAGTA — — — — — — -~ ~ ~ 

AAAGTATCTAATTTACCAACTAATGGGGACAACGTTTCATAAACCACCTTTTTGGCTAA 

AAAGTATCTAATTTACCAACTAATGGGGACAACGTTTCATAAACCACCTTTTTGGCTAA 

AAAGTATCTAATTTACCAACTAATGGGGACAACGTTTCATAAACCACCTTTTTGGCTAA 

GTTAAATCAGATGCTTATCGTCAAAAAATTCAAGCTTTGTCTCCAAATACTGCTGTGGT 
GTTAAATCAGATGCTTATCGTCAAAARATTCAAGCTTTGTCTCCAAATACTGCTGTGGT 
GTTAAATCAGATGCTTATCGTCAAAAAATTCAAGCTTTGTCTCCAAATACTGCTGTGGT 
CTAGAAGACATCTGATTTGATTCCACAATTGGAACAAATTTCGGACAAGCAAGGGATAC 
GTTAAATCAGATGCTTATCGTCAAAAAATTCAAGCTTTGTCTCCAAATACTGCTGTGGT 

CTAGAAGACATCTGATTTGATTCCACAATTGGAACAAATTTCGGACAAGCAAGGGATAC 
GTTAAATCAGATGCTTATCGTCAAAAAATTCAAGC ~ 

CTAGAAGACATCTGATTTGATTCCACAATTGGAACAA " 

CTAGAAGA I~"~""ZZ_ 

CTAGAAGACATCTGATTTGATTCCACAATTGGAACAAATTTCGGACAAGCAAGGGATAC 



SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 

SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 

SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 

SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 



TCCCTTGCTTGTCCGAAATTTGTTCCAATTGTGGAATCAAATCAGATGTCTTCTAGTTT 
TCCCTTGCTTGTCCGAAATTTGTTCCAATTGTGGAATCAAATCAGATGTCTTCTAGTTT 
TCCCTTGCTTGTCCGAAATTTGTTCCAATTGTGGAATCAAATCAGATGTCTTCTAGTTT 
ACAGCAGTATTTGGAGACAAAGCTTGAATTTTTTGACGATAAGCATCTGATTTAACAGT 
TCCCTTGCTTGTCCGAAAT 

ACAGCAGTATTTGGAGACAAAGCTTGAATTTTTTGACGATAAGCATCTGATTTAACAGT 

ACAGCAGTATTTGGAGACAAAGCTTGAATTTTTTGACGATAAGCATCTGATTTAACAGT 

GCCAAAAAGGTGGTTTATGAAACGTTGTCCCCATTAGTTGGTAAATTAGATACTTTAAT 
GCCAAAAAGGTGGTTTATGAAACGTTGTCCCCATTAGTTGGTAAATTAGATACTTTAAT 
GCCAAAAAGGTGGTTTATGAAACGCTGTCCCCATTAGTTGGTAAATTAGATACTTTAAT 
ATGGGAGTACCTATAATACCAACTTTCCCTAAATTAGTTGATTTGATAGCTGCGCTAGC 



ATGGGAGTACCTATAA- 



ATGGGAGTACCTATAATACCAACTTTCCCTGAA 

TTAGGTTGCACGCATTATCCCTTATTACGTCCCATCATTCAAAATGTTATGGGGGCTGA 
TTAGGTTGCACGCATTATCCCCTATTACGTCCCATCATTCAAAATGTTATGGGGGCTGA 
TTAGGTTGCACGCATTATCCCTTATTACGTCCCATCATTCAAAATGTTATGGGGGCTGA 
CCTGGTAAAATAACGCCTAAAACAGGGATGTCTAGTTTTTCTTTAATTTCTTGCCAGGC 




GTTAAATTAATTGATAGTGGCGCAGAAACCGTTCGTGATATTTCTGTTTTATTGAACTA 
GTTAAATTAATTGATAGTGGCGCAGAAACCGTTCGTGATATTTCTGTTTTATTGAACTA 
GTTAAATTAATTGATAGTGGCGCAGAAACCGTTCGTGATATTTCTGTTTTATTGAACTA 
ACTGCAGTTGCTGTATTACAAGCTATAACAATCATCTTAACATTTTTAGTCAATAAGAA 



SEQ1801 

SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 

SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 

SEQ1801 
SEQ1802 
SEQ1803 
SEQ1804 
SEQ1805 
SEQ1806 
SEQ1807 
SEQ1808 
SEQ1809 
SEQ1810 
SEQ1811 
SEQ1812 
SEQ1813 
SEQ1814 






TTTGAGATAAACCATAATTGGCAAAATAAACACGGTGGTCATCACTTTTACACAACCGC 
TTTGAGATAAACCATAATTGGCAAAATAAACACGGTGGTCATCACTTTTACACAACCGC 
TTTGAGATAAMCCATAATTGGSMAAATAAACACGGTGGTCATCACTTTTACACAACCGS 
TTAACCATCTGCCAGGTAAACTCTCTAATCTGTTGAGCAGGTCTAGGACCATACGGAGC 

AGCCCAA 

AGCCCAAAAGGTTTTAAAGAAA 

AGCCCAAAAGGTTTTTAAGGAAATTGCAGAACAATGGCTTAATCAAGAAATAAAT 

CTAGCCTGATCTCCAATGAAGATTACTTCCTCTTCTGGAAGTTGACGGAACATTTCCTT 

ACAACCGTTAAACCACCT 






Table 19: Comparative Sequences relating to SAG1680 (shikimate 5-dehydrogenase) 

SEQ ID NO. 1901: SAG1680 FROM THE 2603 V/R GBS TYPE V STRAIN 

ATCCCTAGACCATTATAAGCATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTCTTTAGGCTTGTAAAC 
CAAGTCGACAACTACTAAATTCGGTGTTAAAATTTCTGGATCGTTAATTAAACTATAATTATCTAATGGCCTCATTCCTA 
AACTAGTAGCATCAATATAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAACGACT 
ACCTTTATTTGAAACTGTTTTTTAATTTTATCTGATAAGTCAATGACCTTATCGTAATTTGAGCTGTTACGATTAAATAA 
TCTAATTTCCGCAACTCCCTCCATAGCTGCTTGAACTGCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAATTA 
TTTTATTTTTAGCACTGAAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCATATGGCCTACTAAACGT 
CCGGTTCCACCTTGATTAACGATAGTATTTACAGCACCCACTAATTTAGCTTGAGGAGATAAATCATCTAGCAAAGGGAT 
AACACTCTGTTTAAATGGCATTGAAACATTAACACCACGAATACCCAATGCCCTGACACCTCGAACAGCTTCTGTTAATT 
TACCCTCTTCTACTTCAAATGTCAGATAGGCATAATTCATGTTTTTTTCTTGAAAAGAGGTATTCCACATTAACGGGGAT 
AGAGAGTGGCGTGCAGG 

SEQ ID NO. 1902: SAG1680 FROM THE H36b GBS TYPE Ib STRAIN 

GTTATTAATTGAAATGCTTCTGCTCCTTGATAAATCAGCATCCCTAGACCATTATAAGCATGTTTCACTCCATTTTGTCT 
AACAAATCGTAACAATGCTGTTTCTTTAGGCTTGTAAACCAAGTCGACAACTACTAAATTCGGTGTTAAAATTTCTGGAT 
CGTTAATTAAACTATAATTATCTAATGGCCTCATTCCTAAACTAGTAGCATCAATATAAAAATGACTAGTTCTAATAGCG 
TCTTTAAATGCTGTCTTATTTTCTAGATAATCAACGACTACCTTTATTTGAAACTGTTTTTTAATTTTATCTGATAAGTC 
AATGACCTTATCGTAATTTGAGCTGTTACGATTAAATAATCTAATTTCCGCAACTCCCTCCATAGCTGCTTGAACTGCAA 
CTGCTTTACCTGAACCACCAATACCAGCTATTGTAATTATTTTATTTTTAGCACTGAAACCTTGAGCTGCTAAAGCTTTA 
AAACAACCAATGCCATCTGTCATATGGCCTACTAAACGTCCGGTTCCACCTTGATTAACGATAGTATTTACAGCACCCAC 
TAATTTAGCTTGAGGAGATAAATCATCTAGCAAAGGGATAACACTCTGTTTAAATGGCATTGAAACATTAACACCACGAA 
TACCCAATGCCCTGACACCTCGAACAGCTTCTGTTAATTTACCCTCTTCTACTTCAAATGTCAGATAGGCATAATTCATG 
TTTTTTTCTTGAAAAGAGGTATTCCACATTAACGGGGATAGAGAGTGGCGTGCAGGA 

SEQ ID NO. 1903: SAG1680 FROM THE M732 GBS TYPE III STRAIN 

CTGGTCTAATTGCCAATCCTGCACGCCACTCTCTATCCCCGTTAATGTGGAATACCTCTTTTCAAGAAAAAAACATGAAT 
TATGCCTATCTGACATTTGAAGTAGAAGAGGGTAAATTAACAGAAGCTGTTCGAGGTGTCAGGGCATTGAGTATTCGTGG 
TGTTAATGTTTCAATGCCATTTAAACAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGTG 
CTGTAAATACTATCGTTAATCAAGGTGGAACCGGACGTTTAGTAGGCCATATGACAGATGGCATTGGTTGTTTTAAAGCT 
TTAGCAGCTCAAGGTTTCAGTGCTAAAAATAAAATAATTACAATAGCTGGTATTGGTGGTTCAGGTAAAGCAGTTGCAGT 
TCAAGCAGCTATGGAGGGAGTTGCGGAAATTAGATTATTTAATCGTAACAGCTCAAATTACGATAAGGTCATTGACTTAT 
CAGATAAAATTAAAAAACAGTTTCAAATAAAGGTAGTCGTTGATTATCTAGAAAATAAGACAGCATTTAAAGACGCTATT 
AGAACTAGTCATTTTTATATTGATGCTACTAGTTTAGGAATGAGGCCATTAGATAATTATAGTTTAATTAACGATCCAGA 
TATTTTAACACCGAATTTAGTAGTTGTCGACTT 

SEQ ID NO. 1904: SAG1680 FROM THE M781 GBS TYPE III STRAIN 

AAATCAGCATCCCTAGACATTATAAGCATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTCTTTAGGCT 
TGTAAACCAAGTCGACAACTACTAAATTCGGTGTTAAAATTTCTGGATCGTTAATTAAACTATAATTATCTAATGGCCTC 
ATTCCTAAACTAGTAGCATCAATATAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATC 
AACGACTACCTTTATTTGAAACTGTTTTTTAATTTTATCTGATAAGTCAATGACCTTATCGTAATTTGAGCTGTTACGAT 
TAAATAATCTAATTTCCGCAACTCCCTCCATAGCTGCTTGAACTGCAACTGCTTTACCTGAACCACCAATACCAGCTATT 
GTAATTATTTTATTTTTAGCACTGAAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCATATGGCCTAC 
TAAACGTCCGGTTCCACCTTGATTAACGATAGTATTTACAGCACCCACTAATTTAGCTTGAGGAGATAAATCATCTAGCA 
AAGGGATAACACTCTGTTTAAATGGCATTGAAACATTAACACCACGAATACTCAATGCCCTGACACCTCGAACAGCTTCT 
GTTAATTTACCCTCTTCTACTTCAAATGTCAGATAGGCATAATTCATGTTTTTTTCTTGAAAAGAGGTATTCCACATTAA 
CGGGGATAGAGAGTGGCGTGCA 

SEQ ID NO. 1905: SAG1680 FROM THE 090 GBS TYPE Ia STRAIN 

GTTCGAGGTGTCAGGGCATTGGGTATTCGTGGTGTTAATGTTTCAATGCCATTTAAACAGAGTGTTATCCCtTTGCTArA 
TGATTTATCTCCTCAAGCTAAATTAGTGGGTGCTGTAAATACTATCGTTAATCAAGGTGGAACCGsACGTTTAGTAGGCC 
ATATGACAGATGGCATTGGTTGTTTTAAAGCTTTAGCAGCTCAAGGTTTCAGTGCTAAAAATAAAATAGTTACAATAGCT 
GGTATTGGTGGTTCAGGTAAAGCAGTTGCAGTTCAAGCAGCTATGGAGGGAGTTGCGGAAATTAGATTATTTAATCGTAA 
TAGCTGAAATTACGATAAGGTCATTGACTTATCAGATAAAATTAAAAAACAGTTTCAAATAAAGGTAGTCGTTGATTATC 
TAGAAAATAAGACAGCATTTAAAGACGCTATTAGAACTAGTCATTTTTATATTGATGCTACTAGTTTAGGAATGArGCCA 
TTAGATAATTATAGTTTAATTAACGATCCAGAAATTTTAACACCCAATTTAGTAGTTGTCGACTTGGTTTACAAGCCTAA 
AGAAACAGCATTGTTACGATTTGTTAGACAAAATGGAGTGAAACATGCTTATAATGGTCTAGGGATGCTGATTTATCAAG 
GAGCAGA 

SEQ ID NO. 1906: SAG1680 FROM THE A909 GBS TYPE Ia STRAIN 

CCCTAGACCATTATAATCATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTCTTTAGGCTTGTAAACCA 

AGTCGACAACTACTAAATTCGGTGTTAAAATTTCTGGATCGTTAATTAAACTATAATTATCTAATGGCCTCATTCCTAAA 

CTAGTAGCATCAATATAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAACGACTAC 

CTTTATTTGAAACTGTTTTTTAATTTTATCTGATAAGTCAATGACCTTATCGTAATTTGAGCTGTTACGATTAAATAATC 

TAATTTCCGCAACTCCCTCCATAGCTGCTTGAACTGCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAATTATT 

TTATTTTTAGCACTGAAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCATATGGCCTACTAAACGTCC 

GGTTCCACCTTGATTAACGATAGTATTTACAGCACCCACTAATTTAGCTTGAGGAGATAAATCATCTAGCAAAGGGATAA 

CACTCTGTTTAAATGGCATTGAAACATTAACACCACGAATACCCAATGCCCTGACACCTCGAACAGCTTCTGTTAATTTA 

CCCTCTTCTACTTCAAATGTCAGATAGGCATAATTCATGTTTTTTTCTTGAAAAGAGGTATTCCACATTAACGGGGATAG 

SEQ ID NO. 1907: SAG1680 FROM THE COH1 GBS TYPE Ia STRAIN 

TGCACGCCACTCTCTATCCCCGTTAATGTGGAATACCTCTTTTAAGAAAAAAACATGAATTATGCCTATCTGACATTTGA 
AGTAGAAGAGGGTAAATTAACAGAAGCTGTTCGAGGTGTCAGGGCATTGAGTATTCGTGGTGTTAATGTTTCAATGCCAT 
TTAAACAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGTGCTGTAAATACT 

SEQ ID NO. 1908: SAG1680 FROM THE CJB110 GBS NONTYPEABLE STRAIN 

ATTCGTTATTAATTGAAATGCTTCTGCTCCTTGATAAATCAGCATCCCTAGACCATTATAAGCATGTTTCACTCCATTTT 
GTCTAACAAATCGTAACAATGCTGTTTCTTTAGGCTTGTAAACCAAGTCGACAACTACTAAATTGGGTGTTAAAATTTCT 
GGATCGTTAATTAAACTATAATTATCTAATGGCCTCATTCCTAAACTAGTAGCATCAATATAAAAATGACTAGTTCTAAT 
AGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAACGACTACCTTTATTTGAAACTGTTTTTTAATTTTATCTGATA 
AGTCAATGACCTTATCGTAATTTGAGCTATTACGATTAAATAATCTAATTTCCGCAACTCCCTCCATAACTGCTTGAACT 
GCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAAGTATTTT 

SEQ ID NO. 1909: SAG1680 FROM THE CJB110 GBS NONTYPEABLE STRAIN 

ACTCTCTATCCCCGTTAATGTGGAATACCTCTTTTCAAGAAAAAAACATGAATTATGCCTATCTGACATTTGAAGTAGAA 
GAGGGTAAATTAACAGAAGCTGTTCGAGGTGTCAGGGCATTGGGTATTCGTGGTGTTAATGTTTCAATGCCATTTAAACA 
GAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGTGCTGTAAATACTATCGTTAATCAAGGTG 
GAACCGGACGTTTAGTAGGCCATATGACAGATGGCATTGGTTGTTTTAAAGCTTTAGCAGCTCAAGGTTTCAGTGCTAAA 
AATAAAATAGTTACAATAGCTGGTATTGGTG 

SEQ ID NO. 1910: SAG1680 FROM THE 1169NT1 GBS TYPE V STRAIN 

ATTCGTTATTAATTGAAATGCTTCTGCTCCTTGATAAATCAGCATCCCTAGACCATTATAAGCATGTTTCACTCCATTTT 
GTCTAACAAATCGTAACAATGCTGTTTCTTTAGGCTTGTAAACCAAGTCGACAACTACTAAATTCGGTGTTAAAATTTCT 
GGATCGTTAATTAAACTATAATTATCTAATGGCCTCATTCCTAAACTAGTAGCATCAATATAAAAATGACTAGTTCTAAT 
AGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAACGACTACCTTTATTTGAAACTGTTTTTTAATTTTATCTGATA 
AGTCAATGACCTTATCGTAATTTGAGCTGTTACGAT 

SEQ ID NO. 1911: SAG1680 FROM THE 1169NT1 GBS TYPE V STRAIN 

ACTTCTCTATTCCCCGTTAATGTGGAATACCTCTTTTCAAGAAAAAAACATGAATTATGCCTATCTGACATTTGAAGTAG 
AAGAGGGTAAATTAACAGAAGCTGTTCGAGGTGTCAGGGCATTGGGTATTCGTGGTGTTAATGTTTCAATGCCATTTAAA 
CAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGTGCTGTAAATACTATCGTTAATCAAGG 
TGGAACC 

SEQ ID NO. 1912: SAG1680 FROM THE 18RS21 GBS TYPE II STRAIN 

TCGTTATTAATTGAAATGCTTCTGCTCCTTGATAAATCATCATCCCTAGACCATTATAAGCATGTTTCACTCCATTTTGT 
CTAACAAATCGTAACAATGCTGTTTCTTTAGGCTTGTAAACCAAGTCGACAACTACTAAATTCGGTGTTAAAATTTCTGG 
ATCGTTAATTAAACTATAATTATCTAATGGCCTCATTCCTAAACTAGTAGCATCAATATAAAAATGACTAGTTCTAATAG 
CGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAACGACTACCTTTATTTGAAACTGTTTTTTAATTTTATCTGATAAG 
TCAATGACCTTATCGTAATTTGAGCTGTTACGATTAAATAATCTAATTTCCGCAAC 

SEQ ID NO. 1913: SAG1680 FROM THE 18RS21 GBS TYPE II STRAIN 

ATGCCTATCTGACATTTGAAGTAGAAGAGGGTAAATTAACAGAAGCTGTTCGAGGTGTCAGGGCATTGGGTATTCGTGGT 
GTTAATGTTTCAATGCCATTTAAACAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGTGC 
TGTAAATACTATCGTTAATCAAGGTGGAACCGGACGTTTAGTAGGCCATATGACAGATGGCATTGGTTGTTTTAAAGCTT 
TAGCAGCTCAAGGTTTCAGTGCTAAAAATAAAATAATTACAATAGCTGGTATTGGTGGTTCAGGTAAAGCAGTTGCAGTT 
CAAGCAGCTATGGAGGGAGTTGCGG 

SEQ ID NO. 1914: SAG1680 FROM THE JM9130013 GBS TYPE VIII STRAIN 

CCCTAGACCATTATAAGTCATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTCTTTAGGCTTGTAAACC 
AAGTCGACAACTACTAAATTGGGTGTTAAAATTTCTGGATCGTTAATTAAACTATAATTATCTAATGGCCTCATTCCTAA 
ACTAGTAGCATCAATATAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAACGACTA 
CCTTTATTTGAAACTGTTTTTTAATTTTATCTGATAAGTCAATGACCTTATCGTAATTTGAGCTATTACGATTAAATAAT 
CTAATTTCCGCAACTCCCTCCATAGCTGCTTGAACTGCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAACTAT 
TTTATTTTTAGCACTGAAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCAT 

SEQ1901 ATCCCT 

SEQ1902 GTTATTAATTGAAATGCTTCTGCTCCTTGATAAATCAGCATCCCT 

SEQ1903 TGGTCTAATTGCCAATCCTGCACGCCACTCTCTAT-CCCCGTTAATGTGGAATACCTCT 
SEQ1904 ---- AAATCAGCATCCCT 

SEQ1907 TGCACGCCACTCTCTAT-CCCCGTTAATGTGGAATACCTCT 

SEQ1908 ATTCGTTATTAATTGAAATGCTTCTGCTCCTTGATAAATCAGCATCCCT 

SEQ1909 ACTCTCTAT-CCCCGTTAATGTGGAATACCTCT 

SEQ1910 ATTCGTTATTAATTGAAATGCTTCTGCTCCTTGATAAATCAGCATCCCT 

SEQ1911 ACTTCTCTATTCCCCGTTAATGTGGAATACCTCT 

SEQ1912 TCGTTATTAATTGAAATGCTTCTGCTCCTTGATAAATCATCATCCCT 

SEQ1913 "IZ:~I~~"~jII~3~~IcCCT 

SEQ1914 

SEQ1901 GACCATTATAAG-CATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTC 

SEQ1902 GACCATTATAAG-CATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTC 

SEQ1903 TTCAAGAAAAAAACATGAATTATGCCTATCTGACATTTGAAGTAGAAGAGGGTAAATTA 

SEQ1904 GAC-ATTATAAG-CATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTC 

SEQ1906 GACCATTATAAT-CATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTC 

SEQ1907 TT-AAGAAAAAAACATGAATTATGCCTATCTGACATTTGAAGTAGAAGAGGGTAAATTA 

SEQ1908 GACCATTATAAG-CATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTC 

SEQ1909 TTCAAGAAAAAAACATGAATTATGCCTATCTGACATTTGAAGTAGAAGAGGGTAAATTA 

SEQ1910 GACCATTATAAG-CATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTC 

SEQ1911 TTCAAGAAAAAAACATGAATTATGCCTATCTGACATTTGAAGTAGAAGAGGGTAAATTA 

SEQ1912 GACCATTATAAG-CATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTC 

SEQ1913 ATGCCTATCTGACATTTGAAGTAGAAGAGGGTAAATTA 

SEQ1914 GACCATTATAAGTCATGTTTCACTCCATTTTGTCTAACAAATCGTAACAATGCTGTTTC 

SEQ1901 TTAGGCTTGTAAACCAAGTC--GACAACTACTAAATTCGGTGTTAAAATTTCTGGATCG 

SEQ1902 TTAGGCTTGTAAACCAAGTC--GACAACTACTAAATTCGGTGTTAAAATTTCTGGATCG 

SEQ1903 CAGAAGCTGTTCGAGGTGTCAGGGCATTGAGTATTCGTGGTGTTAATGTTTCAATGCCA 

SEQ1904 TTAGGCTTGTAAACCAAGTC--GACAACTACTAAATTCGGTGTTAAAATTTCTGGATCG 

SEQ1905 GTTCGAGGTGTCAGGGCATTGGGTATTCGTGGTGTTAATGTTTCAATGCCA 

SEQ1906 TTAGGCTTGTAAACCAAGTC--GACAACTACTAAATTCGGTGTTAAAATTTCTGGATCG 

SEQ1907 CAGAAGCTGTTCGAGGTGTCAGGGCATTGAGTATTCGTGGTGTTAATGTTTCAATGCCA 

SEQ1908 TTAGGCTTGTAAACCAAGTC--GACAACTACTAAATTGGGTGTTAAAATTTCTGGATCG 

SEQ1909 CAGAAGCTGTTCGAGGTGTCAGGGCATTGGGTATTCGTGGTGTTAATGTTTCAATGCCA 

SEQ1910 TTAGGCTTGTAAACCAAGTC--GACAACTACTAAATTCGGTGTTAAAATTTCTGGATCG 

SEQ1911 CAGAAGCTGTTCGAGGTGTCAGGGCATTGGGTATTCGTGGTGTTAATGTTTCAATGCCA 

SEQ1912 TTAGGCTTGTAAACCAAGTC--GACAACTACTAAATTCGGTGTTAAAATTTCTGGATCG 

SEQ1913 CAGAAGCTGTTCGAGGTGTCAGGGCATTGGGTATTCGTGGTGTTAATGTTTCAATGCCA 

SEQ1914 TTAGGCTTGTAAACCAAGTC--GACAACTACTAAATTGGGTGTTAAAATTTCTGGATCG 



SEQ1901 

SEQ1902 

SEQ1903 

SEQ1904 

SEQ1905 

SEQ1906 

SEQ1907 

SEQ1908 

SEQ1909 

SEQ1910 

SEQ1911 

SEQ1912 

SEQ1913 

SEQ1914 

SEQ1901 

SEQ1902 

SEQ1903 

SEQ1904 

SEQ1905 

SEQ1906 

SEQ1907 

SEQ1908 

SEQ1909 

SEQ1910 

SEQ1911 

SEQ1912 

SEQ1913 

SEQ1914 

SEQ1901 
SEQ1902 
SEQ1903 
SEQ1904 
SEQ1905 
SEQ1906 
SEQ1907 
SEQ1908 
SEQ1909 
SEQ1910 
SEQ1911 
SEQ1912 
SEQ1913 
SEQ1914 

SEQ1901 
SEQ1902 
SEQ1903 
SEQ1904 
SEQ1905 
SEQ1906 
SEQ1907 
SEQ1908 
SEQ1909 
SEQ1910 
SEQ1911 
SEQ1912 
SEQ1913 
SEQ1914 



TT -AAT T AAACTATAATT ATCT AATGGCCTCATTCCT-AAACTAGTAGCATCAAT 

TT-AATTAAACTATAATTATCT AATGGCCTCATTCCT -AAACTAGTAGCATCAAT 

TTTAAACAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGT 

TT-AATTAAACTATAATTATCT AATGGCCTCATTCCT -AAACTAGTAGCATCAAT 

TTTAAACAGAGTGTTATCCCTTTGCTARATGATTTATCTCCTCAAGCTAAATTAGTGGGT 

TT-AATTAAACTATAATTATCT AATGGCCTCATTCCT -AAACTAGTAGCATCAAT 

TTTAAACAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGT 

TT-AATTAAACTATAATTATCT AATGGCCTCATTCCT -AAACTAGTAGCATCAAT 

TTTAAACAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGT 

TT-AATTAAACTATAATTATCT AATGGCCTCATTCCT-AAACTAGTAGCATCAAT 

TTTAAACAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGT 

TT-AATTAAACTATAATTATCT AATGGCCTCATTCCT-AAACTAGTAGCATCAAT 

TTTAAACAGAGTGTTATCCCTTTGCTAGATGATTTATCTCCTCAAGCTAAATTAGTGGGT 
TT-AATTAAACTATAATTATCT AATGGCCTCATTCCT -AAACTAGTAGCATCAAT 

TAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAAC 
TAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAAC 
CTGTAAATACTATCGTTAATCAAGGTGGAACCGGACGTTTAGTAGGCCATATGACAGAT 
TAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAAC 
CTGTAAATACTATCGTTAATCAAGGTGGAACCGSACGTTTAGTAGGCCATATGACAGAT 
TAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAAC 

CTGTAAATACT— — — — 

TAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAAC 
CTGTAAATACTATCGTTAATCAAGGTGGAACCGGACGTTTAGTAGGCCATATGACAGAT 
TAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAAC 

CTGTAAATACTATCGTTAATCAAGGTGGAACC 

TAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAAC 
CTGTAAATACTATCGTTAATCAAGGTGGAACCGGACGTTTAGTAGGCCATATGACAGAT 
TAAAAATGACTAGTTCTAATAGCGTCTTTAAATGCTGTCTTATTTTCTAGATAATCAAC 

ACTACCTTTATTTGAAACTGTTTTTTAATTTTATCTGATAAGTCAATGACCTTATCGTA 
ACTACCTTTATTTGAAACTGTTTTTTAATTTTATCTGATAAGTCAATGACCTTATCGTA 

gcattggttgttttaaagctttagcagCtcaaggtttcagtgctaaaaataaaataatt 
actacctttatttgaaactgttttttaattttatctgataagtcaatgaccttatcgta 
gcattggttgttttaaagctttagcagctcaaggtttcagtgctaaaaataaaatagtt 
actacctttatttgaaactgttttttaattttatctgataagtcaatgaccttatcgta 

actacctttatttgaaactgttttttaattttatctgataagtcaatgaccttatcgta 
gcattggttgttttaaagctttagcagctcaaggtttcagtgctaaaaataaaatagtt 
actacctttatttgaaactgttttttaattttatctgataagtcaatgaccttatcgta 

actacctttatttgaaactgttttttaattttatctgataagtcaatgaccttatcgta 
gcattggttgttttaaagctttagcagctcaaggtttcagtgctaaaaataaaataatt 
actacctttatttgaaactgttttttaattttatctgataagtcaatgaccttatcgta 

TTTGAGCTGTTACGATTAAATAATCTAATTTCCGCAACTCCCTCCATAGCTGCTTGAAC 
TTTGAGCTGTTACGATTAAATAATCTAATTTCCGCAACTCCCTCCATAGCTGCTTGAAC 
CAATAGCTGGTATTGGTGGTTCAGGTAAAGCAGTTGCAGTTCAAGCAGCTATGGAGGGA 
TTTGAGCTGTTACGATTAAATAATCTAATTTCCGCAACTCCCTCCATAGCTGCTTGAAC 
CAATAGCTGGTATTGGTGGTTCAGGTAAAGCAGTTGCAGTTCAAGCAGCTATGGAGGGA 
TTTGAGCTGTTACGATTAAATAATCTAATTTCCGCAACTCCCTCCATAGCTGCTTGAAC 

TTTGAGCTATTACGATTAAATAATCTAATTTCCGCAACTCCCTCCATAACTGCTTGAAC 

CAATAGCTGGTATTGGTG 

TTTGAGCTGTTACGAT 

TTTGAGCTGTTACGATTAAATAATCTAATTTCCGCAAC 

CAATAGCTGGTATTGGTGGTTCAGGTAAAGCAGTTGCAGTTCAAGCAGCTATGGAGGGA 
TTTGAGCTATTACGATTAAATAATCTAATTTCCGCAACTCCCTCCATAGCTGCTTGAAC 



SEQ1901 

SEQ1902 

SEQ1903 

SEQ1904 

SEQ1905 

SEQ1906 

SEQ1907 

SEQ1908 

SEQ1909 

SEQ1910 

SEQ1911 

SEQ1912 

SEQ1913 

SEQ1914 

SEQ1901 

SEQ1902 

SEQ1903 

SEQ1904 

SEQ1905 

SEQ1906 

SEQ1907 

SEQ1908 

SEQ1909 

SEQ1910 

SEQ1911 

SEQ1912 

SEQ1913 

SEQ1914 

SEQ1901 
SEQ1902 
SEQ1903 
SEQ1904 
SEQ1905 
SEQ1906 
SEQ1907 
SEQ1908 
SEQ1909 
SEQ1910 
SEQ1911 
SEQ1912 
SEQ1913 
SEQ1914 

SEQ1901 
SEQ1902 
SEQ1903 
SEQ1904 
SEQ1905 
SEQ1906 
SEQ1907 
SEQ1908 
SEQ1909 
SEQ1910 
SEQ1911 
SEQ1912 
SEQ1913 
SEQ1914 



GCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAATTATTTTATTTTTAGCACT 
GCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAATTATTTTATTTTTAGCACT 
TTGCGGAAATTAGATTATTTAATCGTAACAGCTCAAATTACGATAAGGTCATTGACTTA 
GCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAATTATTTTATTTTTAGCACT 
TTGCGGAAATTAGATTATTTAATCGTAATAGCTCAAATTACGATAAGGTCATTGACTTA 
GCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAATTATTTTATTTTTAGCACT 

GCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAACTATTTT 

TTGCGG 

GCAACTGCTTTACCTGAACCACCAATACCAGCTATTGTAACTATTTTATTTTTAGCACT 

AAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCATATGGCCTACTAA 
AAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCATATGGCCTACTAA 
CAGATAAAATTAAAAAACAGTTTCAAATAAAGGTAGTCGTTGATTATCTAGAAAATAAG 
AAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCATATGGCCTACTAA 
CAGATAAAATTAAAAAACAGTTTCAAATAAAGGTAGTCGTTGATTATCTAGAAAATAAG 
AAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCATATGGCCTACTAA 



AAACCTTGAGCTGCTAAAGCTTTAAAACAACCAATGCCATCTGTCAT 

CGTCCGGTTCCACCTTGATTAACGATAGTATTTACAGCACCCACTAATTTAGCTTGAGG 
CGTCCGGTTCCACCTTGATTAACGATAGTATTTACAGCACCCACTAATTTAGCTTGAGG 
CAGCATTTAAAGACGCTATTAGAACTAGTCATTTTTATATTGATGCTACTAGTTTAGGA 
CGTCCGGTTCCACCTTGATTAACGATAGTATTTACAGCACCCACTAATTTAGCTTGAGG 
CAGCATTTAAAGACGCTATTAGAACTAGTCATTTTTATATTGATGCTACTAGTTTAGGA 
CGTCCGGTTCCACCTTGATTAACGATAGTATTTACAGCACCCACTAATTTAGCTTGAGG 




GATAAATCATCTAGCAAAGGGATAACACTCTGTTTAAATGGCATTGAAACATTAACACC 
GATAAATCATCTAGCAAAGGGATAACACTCTGTTTAAATGGCATTGAAACATTAACACC 
TGAGGCCATTAGATAATTATAGTTTAATTAACGATCCAGATATTTTAACACCGAATTTA 
GATAAATCATCTAGCAAAGGGATAACACTCTGTTTAAATGGCATTGAAACATTAACACC 
TGARGCCATTAGATAATTATAGTTTAATTAACGATCCAGAAATTTTAACACCCAATTTA 
GATAAATCATCTAGCAAAGGGATAACACTCTGTTTAAATGGCATTGAAACATTAACACC 



SEQ1901 

SEQ1902 

SEQ1903 

SEQ1904 

SEQ1905 

SEQ1906 

SEQ1907 

SEQ1908 

SEQ1909 

SEQ1910 

SEQ1911 

SEQ1912 

SEQ1913 

SEQ1914 

SEQ1901 
SEQ1902 
SEQ1903 
SEQ1904 
SEQ1905 
SEQ1906 
SEQ1907 
SEQ1908 
SEQ1909 
SEQ1910 
SEQ1911 
SEQ1912 
SEQ1913 
SEQ1914 

SEQ1901 

SEQ1902 

SEQ1903 

SEQ1904 

SEQ1905 

SEQ1906 

SEQ1907 

SEQ1908 

SEQ1909 

SEQ1910 

SEQ1911 

SEQ1912 

SEQ1913 

SEQ1914 




CGAATACCCAATGCCCTGACACCTCGAACAGCTTCTGTTAATTTACCCTCTTCTACTTC 
CGAATACCCAATGCCCTGACACCTCGAACAGCTTCTGTTAATTTACCCTCTTCTACTTC 

TAGTTGTCGACTT ~ 

CGAATACTCAATGCCCTGACACCTCGAACAGCTTCTGTTAATTTACCCTCTTCTACTTC 

TAGTTGTCGACTTGGTTTACAAGCCTAAAGAAACAGCATTGTTACGATTTGTTAGACAA 

CGAATACCCAATGCCCTGACACCTCGAACAGCTTCTGTTAATTTACCCTCTTCTACTTC 

AATGTCAGATAGGCATAATTCATGTTTTTTTCTTGAAAAGAGGTATTCCACATTAACGG 
AATGTCAGATAGGCATAATTCATGTTTTTTTCTTGAAAAGAGGTATTCCACATTAACGG 

AATGTCAGATAGGCATAATTCATGTTTTTTTCTTGAAAAGAGGTATTCCACATTAACGG 

ATGGAGTGAAACATGCTTATAATGGTCTAGGGATGCTGATTTATCAAGGAGCAGA 

AATGTCAGATAGGCATAATTCATGTTTTTTTCTTGAAAAGAGGTATTCCACATTAACGG 

GATAGAGAGTGGCGTGCAGG- 
GATAGAGAGTGGCGTGCAGGA 

I 

GATAGAGAGTGGCGTGCA 

GATAG 




Table 20: Comparative Sequences relating to SAG1723 (signal peptidase I) 

SEQ ID NO. 2001: SAG1723 FROM THE COH1 GBS TYPE Ia STRAIN 

MCGJOTCGATATTGTAGTGGCT^ 

ATCAAATATAAAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCT 
TAAATTACAGGAAAAATATTCGTATAACCC^CTTTT^ 

GCGAATTTACTACTGTCGTGCCTAAAGGCC^CTACTATCTTGTTGGTGATGACCGAATTGTCTCT 
TTCAAAA 

SEQ ID NO. 2002: SAG1723 FROM THE CJB110 GBS NONTYPEABLE STRAIN (REVERSE 
COMPLEMENT) 

TAAAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAGTTCT 

TTGTAGTGGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA 

AATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATACTAAATTATTTAAAAAGGATAAATTACAG^ 

AAAATATTCGTATAACCCACTTTTCCAAGACCTAGCACAAAGCTCTACCGCTTTCACTACTGACAGCAATGGCAGCAGCGAATTTACTA 

CTGTCGTGCCTAAAGGCCACTATTATCTTGTTGGTGATGACCGAATTGTCTCTA 

ACAATTGTGGGAG 

SEQ ID NO. 2003: SAG1680 FROM THE 18RS21 GBS TYPE II STRAIN 

TTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAGTTCTCAAACAAACAAAAATCAATCGATTCGATATTGTA

GTGGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTTAAACGTC^ 
CACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATACT 

ATTCGTATAACCCACTTTTCCAAGACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACTACTGTC
GTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGATAGTCGTGCCGTCGGTCCCTTCAAAAAATCAACGAT 

TGTGGGAGAGGT 

SEQ ID NO. 2004: SAG1680 FROM THE 2603 V/R GBS TYPE V STRAIN (REVERSE 
COMPLEMENT) 

AAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAG 

GTAGTGGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATAAAAA 
TGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATACTAAATTATTTAAAAAGGATAAATTACAGGAAA 
AATATTCGTATAACCCACTTTTCCAAGACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACTACT 
GTCGTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGATAGTCGTGCCGTCGGT 

SEQ ID NO. 2005: SAG1680 FROM THE M732 GBS TYPE III STRAIN (REVERSE COMPLEMENT) 

TTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAGTTCTCAAACAAACAAAATAATCGATTCGATATTGTAGT 

GGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATAAAAATGACA
CCTTAACTATTAACAATAAAAAAACAGAAGAA^ 

TCGTATAACCCACTTTTCCAAGACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACT 
GCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGA 

SEQ ID NO. 2006: SAG1680 FROM THE M781 GBS TYPE III STRAIN 

TTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGT 

GTGGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTT^ 

CACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATACTAAAT 

TATTCGTATAACCCACTTTTCCAAGACCTAGC^^ 

SEQ ID NO. 2007: SAG1680 FROM THE 1169NT1 GBS TYPE V STRAIN (REVERSE 
COMPLEMENT) 

TTGGTAAAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAGTTCTCAAACAAACAAAAATCAATCGATTC
GATATTGTAGTGGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATA 
TAAAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATACTAAATTATTTAA 

AGGAAAAATATTCGTATAACCCACTTTTCCAAGACCTAGCACAAAGCTCTACCGCTTTCACTACTGACAGCAATGGCAGCAGCGAATTT

ACCACTGTCGTGCCTAAAGGCC^CTACTA^ 
ATCAACG 

SEQ ID NO. 2008: SAG1680 FROM THE H36b GBS TYPE Ib STRAIN (REVERSE COMPLEMENT)

TTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAGTTCTCAAACAAACAAAAATCAATCGATTCGATATTGTA 

GTGGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATAAAAATGA 

CACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATACTAAATTATTTAAAAAGGATAA 

ATTCGTATAACCCACTTTTCCAAGACCTAGCACAAAGCTCTACTCCTTTCAC 

GTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGA 

SEQ ID NO. 2009: SAG1680 FROM THE 090 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

TAAAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAGTTCTCAAACAAACAAAAATCAATCGAT 

OTGTAGTGGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTTAAACGTGTCATT 
AATGAC^CCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATACTAA^ 
AAAATATTCGTATAACCCACTTTTCCAAGACCTAGCACAAAGCTCTACCGCTTTCACTACTGACAGCAATGGCAGC 
CTGTCGTGCCTAAAGGCCACTATTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGATAGTCGTGCCGTCGGT 

SEQ ID NO. 2010: SAG1680 FROM THE A909 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

AAAGTTGACGGACACTCCATGGATCCAACTTTAGCT 

TGTAGTGGCTAACGAAGAAGAAGGCGGCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATAAAA 

ATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATACTAAATTATTTAAAAAGGATAAAOT 

AAATATTCGTATAACCCACTTTTCCAAGACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACTAC 

^TCGTGCCTAAAGGCCACTACTATCTTGTT^ 



TGT 
CG 



SEQ2001
SEQ2002     TAAAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG
SEQ2003          TTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG
SEQ2004       AAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG
SEQ2005          TTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG
SEQ2006          TTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG
SEQ2007  TGGTAAAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG
SEQ2008          TTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG
SEQ2009     TAAAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG
SEQ2010      AAAGTTGACGGACACTCCATGGATCCAACTTTAGCTGACAAGGAACAGCTAGTAG

SEQ2001                      ATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2002  TCTCAAACAAACAAAAATCAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2003  TCTCAAACAAACAAAAATCAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2004  TCTCAAACAAACAAAAATCAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2005  TCTCAAACAAACAAAA--TAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2006  TCTCAAACAAACAAAAATCAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2007  TCTCAAACAAACAAAAATCAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2008  TCTCAAACAAACAAAAATCAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2009  TCTCAAACAAACAAAAATCAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG
SEQ2010  TCTCAAACAAACAAAAATCAATCGATTCGATATTGTAGTGGCTAACGAAGAAGAAGGCG

SEQ2001  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2002  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2003  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2004  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2005  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2006  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2007  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2008  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2009  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA
SEQ2010  GCCAAAAGAAAAAAATTGTTAAACGTGTCATTGGTATGCCAGGTGATGTCATCAAATATA

SEQ2001  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2002  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2003  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2004  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2005  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2006  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2007  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2008  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2009  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA
SEQ2010  AAAATGACACCTTAACTATTAACAATAAAAAAACAGAAGAACCTTACCTCAAGGAATATA

SEQ2001  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2002  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2003  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2004  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2005  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2006  CTAAATTATTTTAAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2007  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2008  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2009  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA
SEQ2010  CTAAATTATTT-AAAAAGGATAAATTACAGGAAAAATATTCGTATAACCCACTTTTCCAA



SEQ2001 
SEQ2002 
SEQ2003 
SEQ2004 
SEQ2005 
SEQ2006 
SEQ2007 
SEQ2008 
SEQ2009 
SEQ2010 

SEQ2001 
SEQ2002 
SEQ2003 
SEQ2004 
SEQ2005 
SEQ2006 
SEQ2007 
SEQ2008 
SEQ2009 
SEQ2010 

SEQ2001 
SEQ2002 
SEQ2003 
SEQ2004 
SEQ2005 
SEQ2006 
SEQ2007 
SEQ2008 
SEQ2009 
SEQ2010 

SEQ2001 
SEQ2002 
SEQ2003 
SEQ2004 
SEQ2005 
SEQ2006 
SEQ2007 
SEQ2008 
SEQ2009 
SEQ2010 



GACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACT 
GACCTAGCACAAAGCTCTACCGCTTTCACTACTGACAGCAATGGCAGCAGCGAATTTACT 
GACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACT 
GACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACT 
GACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACT 
GACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACT 
GACCTAGCACAAAGCTCTACCGCTTTCACTACTGACAGCAATGGCAGCAGCGAATTTACC 
GACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACT 
GACCTAGCACAAAGCTCTACCGCTTTCACTACTGACAGCAATGGCAGCAGCGAATTTACT 
GACCTAGCACAAAGCTCTACCGCTTTCACCACTGACAGCAATGGCAGCAGCGAATTTACT 

CTGTCGTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGAT 
CTGTCGTGCCTAAAGGCCACTATTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGAT 
CTGTCGTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGAT 
CTGTCGTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGAT 
CTGTCGTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGA

CTGTCGTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGAT

CTGTCGTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGA 

CTGTCGTGCCTAAAGGCCACTATTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGAT 
CTGTCGTGCCTAAAGGCCACTACTATCTTGTTGGTGATGACCGAATTGTCTCTAAAGAT 

GTCGTGCCGTCGGTTCCTTCAAAA 

GTCGTGCCGTCGGTCCCTTCAAAAAATCAACAATTGTGGGAG 

GTCGTGCCGTCGGTCCCTTCAAAAAATCAACGATTGTGGGAGAGGT 

GTCGTGCCGTCGGT

GTCGTGCCGTCGGCCCCTTCAAAAAATCAACG 

GTCGTGCCGTCGGT 

GTCGTGCCGTCGGTCCCTTCAAAAAATCAACG



Table 21: Comparative Sequences relating to SAG0079 (adenylate kinase)

SEQ ID NO. 2101: SAG0079 FROM THE 2603V/R GBS TYPE V STRAIN

AATCTTTTAATTATGGGTTTGCCTGGTGCTGGTAAAGGTACT 

ACAGGGGATATGTTCCGCGCCGCAATGGCTAATCAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTT 
CCTGATGAAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGTTTTTTACTTGATGGATATCCA 
CGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTTGAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGAT 
CCATCATGTCTTATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTGTTCAACCCACCAGTA
GATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCTGAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGA
GAACCTATTCTTGAACACTATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTTGCAGATGTT

GAAAAAGCGTTG 

SEQ ID NO. 2102: SAG0079 FROM THE 090 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

AATCTTTTAATTATGGGTTTGCCTGGTGCTGGTAAAGGTACTCAAGCAGCTAAGATCGTTGAAGAATTTGGTGTTGCTCACATCTCA 
ACAGGGGATATGTTCCGCGCCGCAATGGCTAATCAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTT 
CCTGATGAAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCCCAGAAAAAGGTTTTTTACTTGATGGATATCCA 
CGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTTGAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGAT 
CCATCATGTCTTATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTGTTCAACCCACCAGTA
GATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCTGAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGA 
GAACCTATTCTTGAACACTATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTTGCAGATGTT 

GAAAAAGCGTTGCTAGAACTCAAA 

SEQ ID NO. 2103: SAG0079 FROM THE 1169NT1 GBS TYPE V STRAIN (REVERSE COMPLEMENT)

TGGTAAAGGGACTCAAGCAGCTAAGATTGTTGAAGAATTTGGTGTTGCGCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGC 
TAATCAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGATCAAGTAACAAACGGGATTGTAAA 
AGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGTTTTTTACTTGATGGGTATCCACGTACTATTGAACAAGCACACGCCTTAGA 
TGCTACGCTTGAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTTATAGAGCGTTTGAGTGG 
TCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTGTTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACG
TGAAGATGATAAGCCTGAAACTGTCAAACGTCGCTTGGACGTTCATATTGCTCAAGGAGAACCTATTCTTGAACACTATAGTAAGCT 

TGGCCTTGTTACAGATATTGAAGGTAATCAAGAAATAA 

SEQ ID NO. 2104: SAG0079 FROM THE 18RS21 GBS TYPE II STRAIN (REVERSE COMPLEMENT)

AATCTTTTAACCACGGGTTCGCCTGGTGCTGGTAAAGGTACTCAAGCAGCTAAGATCGTTGAAGAATTTGGTGTTGCTCACATCTCA 
ACAGGGGATATGTTCCGCGCCGCAATGGCTAATCAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTT 

cctgatgaagtaacaaacgggattgtaaaagagcgcttagctgaggatgatatcgcagaaaaaggttttt™^ 

CGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTTGAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGAT 
CCATCATGTCTTATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTGTTCAACCCACCAGTA 
GATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCTGAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGA
GAACCTATTCTTGAACACTATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTTGCAGATGTT 

GAAAAAGCGTTGCTAGAA 

SEQ ID NO. 2105: SAG0079 FROM THE 2603V/R GBS TYPE V STRAIN (REVERSE COMPLEMENT)

AATCTTTTAATTATGGGTTTGCCTGGTGCTGGTAAAGGTACTCAAGCAGCTAAGATCGTTGAAGAATTTGGTGTTGCTCACATCTCA 

ACAGGGGATATGTTCCGCGCCGCAATGGCTAATCAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTT 

CCTGATGAAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGTTTTTTACTTGATGGATATCCA 

CGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTTGAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGAT 

CCATCATGTCTTATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTGTTCAACCCACCAGTA 

GATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCTGAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGA 

GAACCTATTCTTGAACACTATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAAC

GAAAAAGCGTTG 

SEQ ID NO. 2106: SAG0079 FROM THE A909 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

AATCTTTTAATTATGGGTTTGCCTGGTGCTGGTAAAGGTACTCAAGCAGCTAAGATCGTTGAAGAATTTGGTGTTGCT

ACAGGGGATATGTTCCGCGCCGCAATGGCTAATCAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTT 

CCTGATGAAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAG 

CGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTTGAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGAT 
CCATCATGTCTTATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTGTTCAACCCACCAGTA 
GATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCTGAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGA
GAATCTATTCTTGAACACTATCGAAAGCTTGGTCTTGTTACAGATATTGAAGGTAA 

SEQ ID NO. 2107: SAG0079 FROM THE CJB110 GBS NONTYPEABLE STRAIN (REVERSE COMPLEMENT)

AATCTTTTAACCACGGGTTTGCTTGGTGCTGGTAAAGGTACTCAAGCAGCTAAGATCGTTGAAGAATTTGGTGTTGCTCACATCTCA 
ACAGGGGATATGTTCCGCGCCGCAATGGCTAATCAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTT
CCTGATGAAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAA

CGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTTGAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGAT

CCATCATGTCTTATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTC 

GATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCTGAAACTGTCA^ 

GAACCTATTCTTGAACACTATAG 

AAAAAGCGTTGCTAG 

SEQ ID NO. 2109: SAG0079 FROM THE H36b GBS TYPE Ib STRAIN (REVERSE COMPLEMENT)

CAGGGGATATGTTCCG< 
CTGATGAAGTAACAAA( 
GTACTATTGAACAAGCi 
CATCATGTCTTATAGA< 

AAAAAGCGTTGCT 

SEQ ID NO. 2110: SAG0079 FROM THE JM9130013 GBS TYPE VIII STRAIN (REVERSE COMPLEMENT)

GAACCTATTCTTGAACACTATAAAAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCA



CAGGGGATATGTTCCGCGCCGCAATGGCTAATCAAACCGAAATGGGACGTTTAGCT^ 

ctgatgaagtaacaaacgggattg1 
gtactattgaacaagcacacgcct1 
catcatgtcttatagagcgttt^^^^ 



SEQ ID NO. 2111: SAG0079 FROM THE M732 GBS TYPE III STRAIN (REVERSE COMPLEMENT)

Stttaat?atgggtttgc^ 



GCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCT 
AAAGCGTTGCTAGAACTCAAA 

SEQ2103 - - * 



SEQ2104 atcttttarccacgggttcgcctggtgctggtaarggtactcaagcagctargatcot 

SEQ2105 ATCTTTTZ 

SE0.2106 ATCTTTTARCCACGGGTTTGCTTGGTGCTGGTA 



SEQ2105 ATCTTTTAATTRTGGGTTTGCCTGGTGCTGGTAAAGGTACTCARGCAGCTARGAT^ 



CGGGTTTGCCTGGTGCTGGTRAAGGTACTCAAGCAGCTAAGATCGTT 

RTCTTTTAATTATGGGTTTGCCTGGTGCTGGTAARGGTACTCAAGCAGCTAAGATCCT 
--CTTTTAATTATGGGTTTGCCTGGTGCTGGTAAAGGTACTCAAGCAGCTAAGATTGTT 
ATCTTTTAATTACGGGTTTGCCTGGTGCTGGTAAAGGTACTCAAGCAGCTAAGATTGTT 

AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT 
AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT 



SEQ2107 
SEQ2108 
SEQ2109 
SEQ2110
SEQ2111 
SEQ2112 

SEQ2101 

SeSw AAGAATTTGGTGTTGCGCACM 

SEQ2104 AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAA^ 

SEQ2105 
SEQ2106



AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT 
AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT 
SEQ2107 AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT
SEQ2108 AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT
SEQ2110



SEQ2111
SEQ2112

SEQ2101
SEQ2102
SEQ2103
SEQ2104
SEQ2105
SEQ2106
SEQ2107
SEQ2108
SEQ2109
SEQ2110
SEQ2111
SEQ2112

SEQ2101

SEQ2102 

SEQ2103 

SEQ2104 

SEQ2105 

SEQ2106

SEQ2107 

SEQ2108 

SEQ2109 

SEQ2110 

SEQ2111

SEQ2112 

SEQ2101 
SEQ2102 
SEQ2103
SEQ2104 
SEQ2105 
SEQ2106 
SEQ2107 
SEQ2108 
SEQ2109 
SEQ2110 
SEQ2111 
SEQ2112 

SEQ2101 
SEQ2102
SEQ2103 
SEQ2104 
SEQ2105 
SEQ2106 
SEQ2107 
SEQ2108 
SEQ2109 
SEQ2110
SEQ2111
SEQ2112

SEQ2101
SEQ2102
SEQ2103
SEQ2104
SEQ2105
SEQ2106
SEQ2107
SEQ2108
SEQ2109
SEQ2110
SEQ2111
SEQ2112






CAGGGGATATGTTCCGCGCCGCAATGGCTAAT 

AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT

AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT 
AAGAATTTGGTGTTGCTCACATCTCAACAGGGGATATGTTCCGCGCCGCAATGGCTAAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCCAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCGAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCCAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

CAAACCCAAATGGGACGTTTAGCTAAAAGTTATATTGATAAAGGTGAATTGGTTCCTGAT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

AAGTAACAAACGGGATTGTAAAAGAGCGCTTAGCTGAGGATGATATCGCAGAAAAAGGT 

TTTTTACTTGATGGATATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGGTATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAGCAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAACAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAGCAAGCACACGCCTTAGATGCTACGCTT 
TTTTTACTTGATGGATATCCACGTACTATTGAGCAAGCACACGCCTTAGATGCTACGCTT 

GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCAACATGCCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCATCATGTCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCAACATGCCTT 
GAAGAACTAGGACTACGCTTAGATGGTGTTATTAATATTAAAGTGGATCCAACATGCCTT

ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG
ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGCCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGTCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGCCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 
ATAGAGCGTTTGAGTGGCCGTATTATCAATCGTAAAACTGGTGAAACTTTCCACAAAGTG 

SEQ2102 
SEQ2103 
SEQ2104 
SEQ2105
SEQ2106 
SEQ2107 
SEQ2108 
SEQ2109 
SEQ2110 
SEQ2111 
SEQ2112 



SEQ2101 
SEQ2102 
SEQ2103 
SEQ2104 
SEQ2105 
SEQ2106 
SEQ2107 
SEQ2108 
SEQ2109 
SEQ2110 
SEQ2111 
SEQ2112

SEQ2101 
SEQ2102 
SEQ2103 
SEQ2104 
SEQ2105 
SEQ2106 
SEQ2107 
SEQ2108 
SEQ2109 
SEQ2110 
SEQ2111 
SEQ2112 

SEQ2101 
SEQ2102 
SEQ2103 
SEQ2104 
SEQ2105 
SEQ2106 
SEQ2107 
SEQ2108 
SEQ2109 
SEQ2110 
SEQ2111 
SEQ2112 






TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 
TTCAACCCACCAGTAGATTATAAAGAAGAAGATTACTATCAACGTGAAGATGATAAGCCT 

GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTCATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAATCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAATCTATTCTTGAACAC 
GAAACTGTTAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAAGGAGAACCTATTCTTGAACAC 
GAAACTGTCAAACGTCGCTTGGACGTTAATATTGCTCAA

ATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTT
ATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTT 

ATAGTAAGCTTGGCCTTGTTACAGATATTGAAGGTAATCAAGAAATAA 

ATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTT 
ATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTT 
ATCGAAAGCTTGGTCTTGTTACAGATATTGAAGGTAA 

ATAG 

ATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTT 
ATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTT 

ATAAAAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCA 

ATCGTAAGCTTGGTCTTGTTACAGATATTGAAGGTAATCAAGAAATAACAGAAGTTTTT 

CAGATGTTGAAAAAGCGTTG 

CAGATGTTGAAAAAGCGTTGCTAGAACTCAAA 



CAGATGTTGAAAAAGCGTTGCTAGAA- 
CAGATGTTGAAAAAGCGTTG 



CAGATGTTGAAAAAGCGTTGCTAG- 
CAGATGTTGAAAAAGCGTTGCT 



CAGATGTTGAAAAAGCGTTGCTAGAACTCAAA








>SEQ ID NO 2150: 090 frame: 1

NLLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL 
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH 
YRKLGLVTDIEGNQEITEVFADVEKALLELK 

>SEQ ID NO 2151: 114_1169NT frame: 2

GKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPDQVTNGIVKER
LAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCLIERLSGRIIN
RKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVHIAQGEPILEHYSKLGLVTDI

EGNQEI 

>SEQ ID NO 2152: 114_18RS21 frame: 1 

NLLTTGSPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL 
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH 
YRKLGLVTDIEGNQEITEVFADVEKALLE

>SEQ ID NO 2153: 114_2603 frame: 1 

NLLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD 
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL 
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH
YRKLGLVTDIEGNQEITEVFADVEKAL 

>SEQ ID NO 2154: 114_A909 frame: 1

NLLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD 
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL 
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGESILEH 

YRKLGLVTDIEG 

>SEQ ID NO 2155: 114_A909 frame: 1

NLLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGESILEH
YRKLGLVTDIEG 

>SEQ ID NO 2156: 114_CJB110 frame: 1

NLLTTGLLGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL 
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH 

Y 

>SEQ ID NO 2157: 114_COH1 frame: 3

LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTQMGRLAKSYIDKGELVPDE 
VTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPTCLI
ERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEHY 

RKLGLVTDIEGNQEITEVFADVEKALL

>SEQ ID NO 2158: 114_H36B frame: 3 

GDMFRAAMANQTEMGRLAKSYIDKGELVPDEVTNGIVKERLAEDDIAEKGFLLDGYPRTI 
EQAHALDATLEELGLRLDGVINIKVDPSCLIERLSGRIINRKTGETFHKVFNPPVDYKEE 
DYYQREDDKPETVKRRLDVNIAQGESILEHYRKLGLVTDIEGNQEITEVFADVEKAL

>SEQ ID NO 2159: 114_JM9130013 frame: 1

NLLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH

YKKLGLVTDIEGN 

>SEQ ID NO 2160: 114_M732 frame: 1

LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTQMGRLAKSYIDKGELVPDE
VTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPTCLI
ERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEHY
RKLGLVTDIEGNQEITEVFADVEKALLELK 

>SEQ ID NO 2161: 114_M781 frame: 1 

NLLITGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTQMGRLAKSYIDKGELVPD 
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPTCL 
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQ

LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD

GKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD

LLTTGSPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
LLTTGLLGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTQMGRLAKSYIDKGELVPD

GDMFRAAMANQTEMGRLAKSYIDKGELVPD

LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTEMGRLAKSYIDKGELVPD
LLIMGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTQMGRLAKSYIDKGELVPD
LLITGLPGAGKGTQAAKIVEEFGVAHISTGDMFRAAMANQTQMGRLAKSYIDKGELVPD

EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
QVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPTCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPSCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPTCL
EVTNGIVKERLAEDDIAEKGFLLDGYPRTIEQAHALDATLEELGLRLDGVINIKVDPTCL

IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVHIAQGEPILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGESILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGESILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGESILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQGEPILEH
IERLSGRIINRKTGETFHKVFNPPVDYKEEDYYQREDDKPETVKRRLDVNIAQ

RKLGLVTDIEGNQEITEVFADVEKALLELK

SKLGLVTDIEGNQEI

RKLGLVTDIEGNQEITEVFADVEKALLE--

RKLGLVTDIEGNQEITEVFADVEKAL

RKLGLVTDIEG

RKLGLVTDIEG

RKLGLVTDIEGNQEITEVFADVEKALL

RKLGLVTDIEGNQEITEVFADVEKAL

KKLGLVTDIEGN

RKLGLVTDIEGNQEITEVFADVEKALLELK
Table 22: Comparative Sequences relating to SAG0093
(D-alanyl-D-alanine carboxypeptidase family protein)





SEQ ID NO. 2201: SAG0093 FROM THE 090 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

AAGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATATCCT
CAATTACCAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATC^ 

CCTGTTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTGCTAGAGCAATTGATTCACGAGAA 
CATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAAT 
TTGACGAGGGGACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTC 

ATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTT^ 

CGGTTTCCGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGTCTGCAAAATAT 

ATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAA

SEQ ID NO. 2202: SAG0093 FROM THE 1169NT1 GBS TYPE V STRAIN (REVERSE COMPLEMENT) 

AAGCCTAACAGTCAACAATCATCACCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATATCCT 

CGATTACCAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTG 
CCTGTTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTGCTAGAGCAATTGATTCACGAGAA 
CATTTAATTTCGGGTTATCGTACTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAAT 
TTGACGAGGGGACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGACTGGATTAGCGATGGAT 
ATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTA 
CGGTTTCCGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGTCTGCAAAATAT 
ATGGCCGAACATCGTTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAA 

SEQ ID NO. 2203: SAG0093 FROM THE 18RS21 GBS TYPE II STRAIN 

AAGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATATCCTCTCAAAAAAGAAATAAGAAATTA
CAATTACCAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTT 
CCTGTTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTGCTAGAGCAATTGATTCACGAGAA 
CATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAAT 
TTGACGAGGGGACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGACTGGATTAGCGATGGAT 
ATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTCAGTTGAAAAAGATAGCT

CGGTTTCCGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGTCTGCAAAATAT 
ATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAA 

SEQ ID NO. 2204: SAG0093 FROM THE 2603V/R GBS TYPE V STRAIN 

ACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATATCCTCTCAAAAAAGAAATAAGAAATTACAATTAC 
CAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTTCCTGTTG 
AAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTGCTAGAGCAATTGATTCACGAGAACATTTAA 
TTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGA
GGGGACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGACTGGATTAGCGATGGATATGAGTA 
CTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTC 
CGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGTCTGCAAAATATATGGCCA 
AACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAAAACCCAGCTTTCTTGTACAA

SEQ ID NO. 2205: SAG0093 FROM THE A909 GBS TYPE Ia STRAIN

AAGCCTAACAGTGAAC^XATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGAC^ 
CGATTACCAGCTGTATC^TCAAAAGATTGGAACTTGATTTTGG 

CCTGTTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTGCTAGAGCAATTGATTCACGAGAA 

CATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTACTCAAGAAATGACTAGTAACCCTAAT 

TTGACGAAGGAACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGACTGGATTAGCGATGGAT 

ATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTA 

CGGTTTCCGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTA^ 

ATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAA

SEQ ID NO. 2206: SAG0093 FROM THE CJB110 GBS NONTYPEABLE STRAIN 

AAGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATATCCTCTCAAAAAAGAA^ 

ACAATTACCAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTG 

TCCTGTTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTGCTAGAGCAATTGATTCACGAGA 

ACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAA 

TTTGACGAGGGGAC^yVGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGACT 

TATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTCAGTTGARAAAGATAGCTCC^CAATATC 

ACGGTTTCCGGATGGTAAAACAGCAGAAAC^GGGGTAGGTTATGAAGATTG^ 

TATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAA

Table 22: Comparative Sequences relating to SAG0093
(D-alanyl-D-alanine carboxypeptidase family protein)

SEQ ID NO. 2207: SAG0093 FROM THE COH1 GBS TYPE III STRAIN

CCTAACAGTGAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATA 

ATTACCMCTGTATCATCAAAAGATTGGAACTTGATTTTGGTC^^ 

TGTTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTC^GTTTT^ 

TTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTT 
GACGAGGGGACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGACTGGATTAGCGATGGATAT 
GAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTC^GTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACG 
GTTTCCGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGTCTGCAAAATATAT 
GGTCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAAT^ 

SEQ ID NO. 2208: SAG0093 FROM THE H36b GBS TYPE Ib STRAIN

AAGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGACATC^ 
CGATTACCAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATCGTGACCATAAA^ 

CCTGTTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTGCTAGAGCAATTGATTCACGAGAA 

CATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATT 

TTGACGAAGGAACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTG^ 

ATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTC^GTCAGTTGAAAAAGATAGCTCCACAATATGGTTTT 

CGGTTTCCGGATGGTAAAACAGC^AAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGACT 

ATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAA 

SEQ ID NO. 2209: SAG0093 FROM THE JM9130013 GBS TYPE VIII STRAIN 

AAGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAA 
CAATTACCAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATCGTGACCAT^^ 

CCTGTTGAAAATATTTATTTGGATAAACGTATTACGAAGC^UVGCTACTCAGTTTTTAGAGGCTGCTAGAGCTUITTGATTCACGAGAA 

C^TTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTA^ 

TTGACGAGGGGACAAGCAGAAAAGTTGGTAAAAACTTACTCTGAGCCTGGAGGTGCT 

ATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTCAGTTGAAAAAGATAGCTCGACAATATGGTTTTGTCTTA 
CGGTTTCCGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGTCTGCAAAATAT 
ATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAA 

SEQ ID NO. 2210: SAG0093 FROM THE M732 GBS TYPE III STRAIN 

AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGACATCCTCTCAAAAAAGAAATAAGAAATTAC
GATTACCAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTGC 
CTGTTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTT^ 

ATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTA 

TGACGAGGGGACAAGCAGAAAAGTTGGTAAAAACTTACTCTGAGCCTGCAGGTC 

TGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTCAGTCAGTTGAAAAAGATAGCTC 

GGTTTCCGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGTCTGCAAAATATA 
TGGTCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAAAACCCAGCTTTCTT 

SEQ ID NO. 2211: SAG0093 FROM THE M781 GBS TYPE III STRAIN

AAGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGACATCCTCTCAAAAAAGAAATAAGAAATTA

CGATTACCAGCTGTATCATCAAAAGATTGGAACTTGATTTTGGTCAATCGTGA^^ 

CCTGTTGAAAATATTTATTTGGATAAACGTATTAC&AAGCAAGCTACTCAGTTTO 

CATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGGAGAAGTTCT^ 

TTGACGAGGGGACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTC 

ATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAGTC7VGTCACT 

CGGTTTCCGGATGGTAAAACAGCAGAAACAGGGGTAGGTTATGAAGACT^ 

ATGGTCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGGAGAATAACCAA 



SEQ2201 AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATA

SEQ2202 AGCCTAACAGTCAACAATCATCACCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATA 

SEQ2203 AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATA 

SEQ2204 ACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATA 

SEQ2205 AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGACA 

SEQ2206 AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATA 

SEQ2207 --CCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGACA

SEQ2208 AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGACA 

SEQ2209 AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGATA

SEQ2210 AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGACA 

SEQ2211 AGCCTAACAGTCAACAATCATCATCTCAAAAGTTGAGGAATGAGGATATAAAAAAGACA

SEQ2201 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACAATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2202 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACGATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2203 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACAATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2204 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACAATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2205 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACGATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2206 TCCTCTCAAAAAAGAAAT-AAGAAATTTACAATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2207 TCCTCTCAAAAAAGAAATTAAGAAATT-ACGATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2208 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACGATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2209 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACAATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2210 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACGATTACCAGCTGTATCATCAAAAGATTGGA
SEQ2211 TCCTCTCAAAAAAGAAAT-AAGAAATT-ACGATTACCAGCTGTATCATCAAAAGATTGGA

SEQ2201 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTTCCTG
SEQ2202 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTGCCTG
SEQ2203 ACTTGATTTTGGTCAATCGTGACCATAAACRTGAAGAATTAAGTCCAGATGTGGTTCCTG
SEQ2204 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTTCCTG
SEQ2205 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTGCCTG
SEQ2206 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTTCCTG
SEQ2207 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTGCCTG
SEQ2208 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTGCCTG
SEQ2209 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTTCCTG
SEQ2210 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTGCCTG
SEQ2211 ACTTGATTTTGGTCAATCGTGACCATAAACATGAAGAATTAAGTCCAGATGTGGTGCCTG

SEQ2201 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2202 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2203 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2204 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2205 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2206 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2207 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2208 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2209 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2210 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG
SEQ2211 TTGAAAATATTTATTTGGATAAACGTATTACGAAGCAAGCTACTCAGTTTTTAGAGGCTG

SEQ2201 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2202 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2203 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2204 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2205 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2206 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2207 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2208 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2209 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2210 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG
SEQ2211 CTAGAGCAATTGATTCACGAGAACATTTAATTTCGGGTTATCGTAGTGTTGCCTATCAGG

SEQ2201 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG
SEQ2202 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG
SEQ2203 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG
SEQ2204 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG
SEQ2205 AGAAGTTGTTCAATTCTTATGTTACTCAAGAAATGACTAGTAACCCTAATTTGACGAAGG
SEQ2206 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG
SEQ2207 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG
SEQ2208 AGAAGTTGTTCAATTCTTATGTTACTCAWGAAATGACTAGTAACCCTAATTTGACGAAGG
SEQ2209 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG
SEQ2210 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG
SEQ2211 AGAAGTTGTTCAATTCTTATGTTACTCAAGAGATGACTAGTAACCCTAATTTGACGAGGG

SEQ2201 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA 

SEQ2202 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA

SEQ2203 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA

SEQ2204 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA 

SEQ2205 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA 

SEQ2206 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA 

SEQ2207 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA 

SEQ2208 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA 

SEQ2209 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA

SEQ2210 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTGAGCCTGCAGGTGCTAGTGAACACCAGA 

SEQ2211 ACAAGCAGAAAAGTTGGTAAAAACTTACTCTCAGCCTGCAGGTGCTAGTGAACACCAGA 

SEQ2201 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2202 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2203 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2204 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2205 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2206 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2207 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2208 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2209 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2210 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2211 CTGGATTAGCGATGGATATGAGTACTGTAGATTCTTTGAATGAGAGCGATCCTAGAGTAG 

SEQ2201 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA 

SEQ2202 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA 

SEQ2203 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA 

SEQ2204 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA 

SEQ2205 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA 

SEQ2207 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA 

SEQ2208 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA

SEQ2209 TCAGTCAGTTGAAAAAGATAGCTCCAGAATATGGTTTTGTCTTACGGTTTCCGGATGGTA 

SEQ2210 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA

SEQ2211 TCAGTCAGTTGAAAAAGATAGCTCCACAATATGGTTTTGTCTTACGGTTTCCGGATGGTA 

SEQ2201 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT 

SEQ2202 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT

SEQ2203 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT 

SEQ2204 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT 

SEQ2205 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT 

SEQ2206 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT

SEQ2207 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT 

SEQ2208 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT 

SEQ2209 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT 

SEQ2210 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT

SEQ2211 AAACAGCAGAAACAGGGGTAGGTTATGAAGATTGGCATTACCGCTATGTTGGGGTAGAGT 

SEQ2201 CTGCAAAATATATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG

SEQ2202 CTGCAAAATATATGGCCGAACATCGTTTAACATTAGAAGAATACATAACTTTATTAAAGG

SEQ2203 CTGCAAAATATATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG 

SEQ2204 CTGCAAAATATATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG

SEQ2205 CTGCAAAATATATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG 

SEQ2206 CTGCAAAATATATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG 

SEQ2207 CTGCAAAATATATGGTCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG

SEQ2208 CTGCAAAATATATGGCCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG

SEQ2209 CTGCAAAATATATGGCGAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG 

SEQ2210 CTGCAAAATATATGGTCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG

SEQ2211 CTGCAAAATATATGGTCAAACATCATTTAACATTAGAAGAATACATAACTTTATTAAAGG

SEQ2201 AGAATAACCAA
SEQ2202 AGAATAACCAA
SEQ2203 AGAATAACCAA
SEQ2204 AGAATAACCAAAACCCAGCTTTCTTGTACAA
SEQ2205 AGAATAACCAA
SEQ2206 AGAATAACCAA
SEQ2207 AGAATAACCAAAACCCAGCTTTCTTGTACAA
SEQ2208 AGAATAACCAA
SEQ2209 AGAATAACCAA
SEQ2210 AGAATAACCAAAACCCAGCTTTCTT
SEQ2211 AGAATAACCAA



>SEQ ID NO 2250: 18_090 frame: 1

KPNSQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVNRDHKHEELSPDVVPV
ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ

>SEQ ID NO 2251: 18_1169NT frame: 1

KPNSQQSSPQKLRNEDIKKISSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV
ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
TAETGVGYEDWHYRYVGVESAKYMAEHRLTLEEYITLLKENNQ

>SEQ ID NO 2252: 18_18RS21 frame: 1

KPNSQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVNRDHKHEELSPDVVPV
ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ

>SEQ ID NO 2253: 18_2603 frame: 3

SQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVNRDHKHEELSPDVVPVENI
YLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRGQAE
KLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGKTAE
TGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQNPAFLY

>SEQ ID NO 2254: 18_A909 frame: 1

KPNSQQSSSQKLRNEDIKKTSSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV
ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTKE
QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ

>SEQ ID NO 2255: 18_CJB110 frame: 1

KPNSQQSSSQKLRNEDIKKISSQKRNKKFTITSCIIKRLELDFGQS

>SEQ ID NO 2256: 18_COH1 frame: 1

PNSQQSSSQKLRNEDIKKTSSQKRN

>SEQ ID NO 2257: 18_H36B frame: 1

KPNSQQSSSQKLRNEDIKKTSSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV
ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTXEMTSNPNLTKE
QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ

>SEQ ID NO 2258: 18_JM9130013 frame: 1

KPNSQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVNRDHKHEELSPDVVPV
ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ

>SEQ ID NO 2259: 18_M732 frame: 3

PNSQQSSSQKLRNEDIKKTSSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPVE
NIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRGQ
AEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGKT
AETGVGYEDWHYRYVGVESAKYMVKHHLTLEEYITLLKENNQNPAF

>SEQ ID NO 2260: 18_M781 frame: 1

KPNSQQSSSQKLRNEDIKKTSSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV
ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
TAETGVGYEDWHYRYVGVESAKYMVKHHLTLEEYITLLKENNQ



SEQ2250 KPNSQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVNRDHKHEELSPDVVPV
SEQ2251 KPNSQQSSPQKLRNEDIKKISSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV
SEQ2252 KPNSQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVNRDHKHEELSPDVVPV
SEQ2253 ---SQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVNRDHKHEELSPDVVPV
SEQ2254 KPNSQQSSSQKLRNEDIKKTSSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV
SEQ2255 KPNSQQSSSQKLRNEDIKKISSQKRNKKFTITSCIIKRLELDFGQS
SEQ2256 -PNSQQSSSQKLRNEDIKKTSSQKRN
SEQ2257 KPNSQQSSSQKLRNEDIKKTSSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV
SEQ2258 KPNSQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVNRDHKHEELSPDVVPV
SEQ2259 -PNSQQSSSQKLRNEDIKKTSSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV
SEQ2260 KPNSQQSSSQKLRNEDIKKTSSQKRNKKLRLPAVSSKDWNLILVNRDHKHEELSPDVVPV

SEQ2250 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
SEQ2251 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
SEQ2252 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
SEQ2253 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
SEQ2254 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTKE
SEQ2257 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTXEMTSNPNLTKE
SEQ2258 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
SEQ2259 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG
SEQ2260 ENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEMTSNPNLTRG

SEQ2250 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
SEQ2251 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
SEQ2252 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
SEQ2253 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
SEQ2254 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
SEQ2257 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
SEQ2258 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
SEQ2259 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK
SEQ2260 QAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYGFVLRFPDGK

SEQ2250 TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ
SEQ2251 TAETGVGYEDWHYRYVGVESAKYMAEHRLTLEEYITLLKENNQ
SEQ2252 TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ
SEQ2253 TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQNPAFLY
SEQ2254 TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ
SEQ2257 TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ
SEQ2258 TAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ
SEQ2259 TAETGVGYEDWHYRYVGVESAKYMVKHHLTLEEYITLLKENNQNPAF
SEQ2260 TAETGVGYEDWHYRYVGVESAKYMVKHHLTLEEYITLLKENNQ

Table 23: Comparative Sequences relating to SAG0163
(competence protein CglA)

SEQ ID NO. 2301: SAG0163 FROM THE 090 GBS TYPE III STRAIN (REVERSE COMPLEMENT)

GGCAGTAGAAGTAAATGCTC^GATATTTATATCATO 
GTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTT^^ 

ACGAAGTCAATTAGGTTCTTGTGACTATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTGG
TCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGTTTGATAATATAAAGCAAATGAAGGAAGT 
ACTGGGTACAAGAGGGCTATATCTTTTTTCCGGCCCTGTGGGGAGTGGTAAAA 

TAAAAATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAACTCCAATTGAATGAGGATATTGG 

AATGACTTATGATGCTTTAATGAAACTGTCTTTACGGCATCGTCC^GATATTTTAATTATCGG 

CCGTGCTGTTATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCATGCT 

TATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTAAAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAM 
TGACTTTGAGACAGGTAACTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAAGGACATATCAG 
TAAGAAACAGGCACAAGTCGAAAAAATTATCCCTCAAGAAACAACGGAAAGTAGTCCAACTTTT

SEQ ID NO. 2302: SAG0163 FROM THE 1169NT1 GBS TYPE V STRAIN (REVERSE COMPLEMENT) 
GGTGATTGTTATGAAACCTCTACTATTGCGTATTTGATGA^ 

TTATTAGTCACTTTAAATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACTATGAACTGTCAG 

AGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTGGTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTC 

ATCAGGACTTAAAATATTGGTTTGATAATATAAAGCAAATGAAGGAAGTACTGGGTACAAGAGGGCTATATCT 

TGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTC^GAAGTATTTAAAAATAAGCAAATTATCACGATTC 

AAATCAAGAATGACAAGATGTTACAACTCCAATTGAATGAGGATATTGGAATGACTTATGATGCT 

ATCGTCCAGATATTTTAATTATCGGAGAGATTAGAGATCAAGCGACGGCTCGTGCTGTTATTCGTGCAAGTTTAACGGGAGTGATGG 
TTTTTTCTACTATTCATGCTAAAAGTATTCCCGGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATA 
GTCTAAAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAAGTAACTTTAAAAAACACTCATCAG 
ACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAAGGATATATCAGTAAGAAACAGGCACAAGTCGAAAAAATTATCCCTCAAG 

AAACAACGGAAAGTAGTCCAACTTTT 

SEQ ID NO. 2303: SAG0163 FROM THE 18RS21 GBS TYPE II STRAIN (REVERSE COMPLEMENT) 

GTTCAATCATTAGCAAAGCAAGTCATTCATCAGGCAGTAGAAGTAAATGCTCAAGATATTTATA 

GAACTCTATATGCGTATTGATGATGAAAGGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTAAA 

TTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACTATGAACTGTCAGAG 

TTATOACTATCGAGTGTGGGAGATTATCGTGGTC^GAATCOTTAGTTATTCGTATTTTGTATTCAGGT 

TGGTTTGATAATATAAAGCAAATGAAGGAACTACTGGGTATAAGAGGGCTATATCTTTTTTCCGGCCCTGTGGG 

ACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAAATAAGCAAATTATCAC^ATTGAAGAT 

ATGTTACAACTCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGCATCGTCCAGATATTTTA
ATTATCGGAGAGATTAGAGATCAAGCGACGGCCCGTGCTGTTATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCAT 
GCTAAAAGTATTCX:CGGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAA 

TATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGTAATTTTAAAAAACACTCATCAGACAAGTGGAATAGACAA 
GTGGATATCTTGGCTGAAGAAGGACATATCAGTAAGAAACAGGCACAAGTCGAAAAAATTATCCCTCAAGAM 



SEQ ID NO. 2304: SAG0163 FROM THE 2603 V/R GBS TYPE V STRAIN (REVERSE COMPLEMENT) 

GATATTTATATCATTCCCAAAGGTGATTGTTATGAACTCTATATGCGTATTGATGATGAAAGGCGGTTTA 
AATAGGATGGCTAGTCTTATTAGTCACTTTAAATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAM 

GACTATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTGGTCAAGAATCTTTAGTTATTCGT 
ATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGTTTGATAATATAAAGCAAATGAAGGAAGTACTGGGTATAAGAGGGCTATAT 

CTTTTTTCOGGCCCTGTGGGGAGTGGTAAAAGAACTCTCATGTATCA^ 

ATTGAAGATCC&GTAGAAATCAAGAATGAGAAGATGTTACAACT 

AAACTGTCTTTACGGCATCOTCCAGATATTTTAATTATCGGAGAGATT 

TTAACGGGAGTGATGGTTTTTTCTACTATTCATGCTAAAAGTATTC<X«GAGTCTATGATAGGCTTATA 

CAAGAGTTAGAAAATAGTCTAAAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGTAATTTT 

AAAAAAC^CTCATCAGACAAGrGGAATAGACAAGTGGATATCTTGGCTGAAGAAGGACATATCAGTA^ 

AAAAATTATCCCTCAAGAAACAACGGAAAGTAGTCCAACTTTT 



CCAACTTTT

SEQ ID NO. 2305: SAG0163 FROM THE A909 GBS TYPE Ia STRAIN (REVERSE COMPLEMENT)

CCAACTTTT 

SEQ ID NO. 2306: SAG0163 FROM THE CJB110 GBS NONTYPEABLE STRAIN (REVERSE COMPLEMENT)

gt^StSttagI^^ 

GAACTCTATATGCGTATTGATGATfa^^ 



TTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACTATGAACTGTC^GAGGGAAG 

^^^^^^^^^^^^^ 

ATTATCGGAGAGA^ 



GCTAAAAGTATTTCCGGAGTCTATGAT 
TMCAACGTT^AATTGGAG 

CCAACTTTT 

SEQ ID NO. 2307: SAG0163 FROM THE COH1 GBS TYPE III STRAIN (REVERSE COMPLEMENT)

^gtgattcttatcaaattS 
St^taaa^Sctct^ 

A^^TATTTTAATTATCGGAGAGATTAGAGATCAAGCGACGGCCCGTGCTGCT^^ 

totc^ac*SS^^ 
caacggaaagtagtccaactttt 

TTTT 








SEQ ID NO. 2309: SAG0163 FROM THE JM9130013 GBS TYPE VIII STRAIN (REVERSE COMPLEMENT)
GTTCAATCATTAGCAAAGCAAGTCATTCATCAGGCAGTAGAAGTAAATGCT 
GAACTCTATATGCGTATTGATGATGAAAGGCGGTTTATTGATGTTTTTGAGTC 
TTTGTGGC^GGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTCT 

TTACGACTATCGAGTGTGGGAGATTATCGTGGTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATAT 
TGGTTTGATAATATAAAGCAAATGAAGGAAGTACTGGGTATAAGAGGGCTATATCTTTTTTCCGGCCCTGTGGGGAGTGGTAAAACA 
ACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAAATAAGCAAATTATCAC^ 

ATGTTACAACTCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGCATCGTCCAGATATTTT^ 
ATTATCGGAGAGATTAGAGATCAAGCGACGGCCCGTGCTGTTATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCAT 
GCTAAAAGTATTCCCGGAGTCTATGATAGGCHTATAGAATTAGGGGTTAACT 

TATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGTAATTTTAAAAAACACTCATCAGACAAGTGGAATAGACAA 

GTGGATATCTTGGCTGAAGAAGGACATATCAGTAAGAAACAGGCAC^^GTCGAAAAAATTATCCCTCAAGAAAC^^ 

CCAACTTTT 

SEQ ID NO. 2310: SAG0163 FROM THE M732 GBS TYPE III STRAIN (REVERSE COMPLEMENT) 

TGACTTGTTATGAAACTCTATATGCGTATTTGATGATGAAAAGGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTA 

TTAGTCACTTTAAATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGA 

GAAGACTGGPI^TCATTACGACTATCAAGTGTGGGAGATTATCGTGGTCAAGAATCTTTAGTTATTCGTACTTTGTATTCM 

AGGACTTAAAATATTGGTTTGATAATATAAAGTAAATGAAGGAAGTACTGTGTGCAAGAGGGCTATATCTTTTTTCCGGCCCTGTGG 

GGAGTGGTAAAAGAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAAATAAGC 

TCAAGAATGACAAGATGTTACAACTCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGCATC 

GTCCAGATATTTTAATTATCGGAGAGATTAGAGATCAAGCGACGGCCCGTGCTGTTATTCGTGG^GTTTAACGGGAGTAATGGTTT 

TTTCTACTATTCATGCTAAAAGTATTCCCGGAGTCTATGATAGGCTTATAGAATTAGGGGTO 

TAAAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACA^ 

AGTGGAATAGACAAGTGGATATCTTGGCTGAAGAAGGACATATCAGTAAGAAAC^ 

CAACGGAAAGTAGTCCAACTTTT 

SEQ ID NO. 2311: SAG0163 FROM THE M781 GBS TYPE III STRAIN (REVERSE COMPLEMENT)
CAGTAGAAGTAAATGCTCAAGATATTTATATCATTCCCAAAGGTGATTGTTATGAATTCTA^ 

TTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTAAATTTGTGGCAGGCATGAACGTTGGAGA 

GAAGTCAATTAGGTTCTTGTGACTATGAACTGTC^GAGGGAAGACTGGtf^ 

AAGAATCTOTAGTTATTCGTACTTTGTATTCAGGTCATCAGGACTTAAAATATTG 

TGTGTGCS^GAGGGCTATATCTTTTTTCCGGCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTA 
AAAATAAGCAAATTATC^CGATTGAAGATCCGGTAGAAATCtf^GAATGACAAGATGOT 

TGACTTATGATGCTTTAATCAAACTGTCTTTACGGCATCGTCCAGATATTTTAATTATCGGAGAGATTAGAGATCAAGCGACGGCCC 

GTGCTGTTATTCGTGC^GTTTAACGGGAGTAATGGTTTTTTCTACTATTCATGCTAA 

TAGAATTAGGGGTTAACTATC^GAGTTAGAAAATAGTCTAAAATTAATAG^^ 

ACTTTGAGACAAGTAACTTTAAAAAACACTCATC^GACAAGTGGAATAGACAAGTGGATATCTT^ 

AGAAACAGGCACAAGTCGAAAAAATTATCCCTCAAGAAACAACGGAAAGTAGTCCAACTTTT



SEQ2301 GGCAGTAGAAGTAAATGCTCAAGATATT

SEQ2302

SEQ2303 TTCAATCATTAGCAAAGCAAGTCATTCATCAGGCAGTAGAAGTAAATGCTCAAGATATT

SEQ2304 GATATT

SEQ2305 TTCAATCATTAGCAAAGCAAGTCATTCATCAGGCAGTAGAAGTAAATGCTCAAGATATT

SEQ2306 TTCAATCATTAGCAAAGCAAGTCATTCATCAGGCAGTAGAAGTAAATGCTCAAGATATT

SEQ2307

SEQ2308 TCATTAGCAAAGCAAGTCATTCATCAGGCAGTAGAAGTAAATGCTCAAGATATT

SEQ2309 TTCAATCATTAGCAAAGCAAGTCATTCATCAGGCAGTAGAAGTAAATGCTCAAGATATT

SEQ2310

SEQ2311 CAGTAGAAGTAAATGCTCAAGATATT

SEQ2301 ATATCATTCCCAAAGGTGA-TTGTTATGAA-CTCTATA TGCGTATT-GATGATGA 

SEQ2302 GGTGA-TTGTTATGAA-ACCTCTACTATTGCGTATTTGATGATGA 

SEQ2303 ATATCATTCCCAAAGGTGA-TTGTTATGAA-CTCTATA TGCGTATT-GATGATGA 

SEQ2304 ATATCATTCCCAAAGGTGA-TTGTTATGAA-CTCTATA TGCGTATT-GATGATGA 

SEQ2305 ATATCATTCCCAAAGGTGA-TTGTTATGAA-CTCTATA TGCGTATT-GATGATGA 

SEQ2306 ATATCATTCCCAAAGGTGA-TTGTTATGAA-CTCTATA TGCGTATT-GATGATGA

SEQ2307 AGGTGA-TTGTTATGAAATTCTATA TGCGTATT-GATGATGA 

SEQ2308 ATATCATTCCCAAAGGTGA-TTGTTATGAA-CTCTATA TGCGTATT-GATGATGA 

SEQ2309 ATATCATTCCCAAAGGTGA-TTGTTATGAA-CTCTATA TGCGTATT-GATGATGA 

SEQ2310 TGACTTGTTATGAAACTCTATA TGCGTATTTGATGATGA 

SEQ2311 ATATCATTCCCAAAGGTGA-TTGTTATGAA-TTCTATA TGCGTATT-GATGATGA 

SEQ2301 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 

SEQ2302 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 

SEQ2303 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 

SEQ2304 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 

SEQ2305 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 
SEQ2306 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA

SEQ2307 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 
SEQ2308 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 
SEQ2309 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 
SEQ2310 AAAGGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 
SEQ2311 AA-GGCGGTTTATTGATGTTTTTGAGTTTAATAGGATGGCTAGTCTTATTAGTCACTTTA 

SEQ2301 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT
SEQ2302 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
SEQ2303 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
SEQ2304 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
SEQ2305 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
SEQ2306 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
SEQ2307 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
SEQ2308 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
SEQ2309 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT
SEQ2310 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
SEQ2311 AATTTGTGGCAGGCATGAACGTTGGAGAAAAAAGACGAAGTCAATTAGGTTCTTGTGACT 
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Tabl 23: Comparative Sequences relating t SAG016: 
(competence pr teinCglA) 





SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ230B 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 



ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATOGAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCAAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCGAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCAAGTGTGGGAGATTATCGTG 
ATGAACTGTCAGAGGGAAGACTGGTTTCATTACGACTATCAAGTGTGGGAGATTATCGTG 

GTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTACTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTATTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTACTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 
GTCAAGAATCTTTAGTTATTCGTACTTTGTATTCAGGTCATCAGGACTTAAAATATTGGT 

TTGATAATATAAAGCAAATGAAGGAAGTACTGGGTACAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGCAAATGAAGGAAGTACTGGGTACAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGCAAATGAAGGAAGTACTGGGTATAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGCAAATGAAGGAAGTACTGGGTATAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGCAAATGAAGGAAGTACTGGGTATAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGGAAATGAAGGAAGTACTGGGTACAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGTAAATGAAGGAAGTACTGTGTGCAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGCAAATGAAGGAAGTACTGGGTATAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGCAAATGAAGGAAGTACTGGGTATAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGTAAATGAAGGAAGTACTGTGTGCAAGAGGGCTATATCTTTTTTCCG 
TTGATAATATAAAGCAAATGAAGGAAGTACTGTGTGCAAGAGGGCTATATCTTTTTTCCG 

GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGACTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
GCCCTGTGGGGAGTGGTAAAACAACTCTCATGTATCAATTAGCTTCAGAAGTATTTAAAA 
Table 23: Comparative Sequences relating to SAG01 
(competence protein CglA) 





SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 



ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 
ATAAGCAAATTATCACGATTGAAGATCCGGTAGAAATCAAGAATGACAAGATGTTACAAC 

TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATCACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 
TCCAATTGAATGAGGATATTGGAATGACTTATGATGCTTTAATCAAACTGTCTTTACGGC 

ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCTCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAAATAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 
ATCGTCCAGATATTTTAATTATCGGAGAGAT-TAGAGATCAAGCGACGGCCCGTGCTGTT 

ATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCATGCTAAAAGTATTTCC 
ATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCATGCTAAAAGTATTCCC 
ATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCATGCTAAAAGTATTCCC 
ATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCATGCTAAAAGTATTCCC 
ATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCATGCTAAAAGTATTCCC 
ATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCATGCTAAAAGTATTTCC 
ATTCGTGCAAGTTTAACGGGAGTAATGGTTTTTTCTACTATTCATGCTAAAAGTATTCCC 
ATTCGTGCAAGTTTAACGGGAGTGATGTTTTTTTCTACTATTCATGCTAAAAGTATTCCC 
ATTCGTGCAAGTTTAACGGGAGTGATGGTTTTTTCTACTATTCATGCTAAAAGTATTCCC 
ATTCGTGCAAGTTTAACGGGAGTAATGGTTTTTTCTACTATTCATGCTAAAAGTATTCCC 
ATTCGTGCAAGTTTAACGGGAGTAATGGTTTTTTCTACTATTCATGCTAAAAGTATTCCC 

GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 
GGAGTCTATGATAGGCTTATAGAATTAGGGGTTAACTATCAAGAGTTAGAAAATAGTCTA 

AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAAGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAAGT 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 

SEQ2301 
SEQ2302 
SEQ2303 
SEQ2304 
SEQ2305 
SEQ2306 
SEQ2307 
SEQ2308 
SEQ2309 
SEQ2310 
SEQ2311 




AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAGGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAAGT 
AAATTAATAGCATATCAACGTTTAATTGGAGGAGGAAGCCTAATTGACTTTGAGACAAGT 

AACTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AACTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AATTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AATTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AATTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AACTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AACTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AATTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AATTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AACTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 
AACTTTAAAAAACACTCATCAGACAAGTGGAATAGACAAGTGGATATCTTGGCTGAAGAA 

GGACATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGATATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCACAAGTGCGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCAGAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 
GGACATATCAGTAAGAAACAGGCACAAGT-CGAAAAAATTATCCCTCAAGAAACAACGGA 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 

AAGTAGTCCAACTTTT 



>SEQ ID NO 2350:63_090 frame: 2 

AVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFVAGMNVGEKRRS 
QLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDNIKQMKEVLGTR 
GLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQLNEDIGMTYDAL 
IKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSISGVYDRLIELGVNYQ 
ELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHISKKQAQVEKII 
PQETTESSPTF 

>SEQ ID NO 2351:63_H69NT frame: 3 

.LL.NLYYCVFDDERRFIDVFEFNRMASLISHFKFVAGMNVGEKRRSQLGSCDYELSEGR 
LVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDNIKQMKEVLGTRGLYLFSGPVGSGK 
TTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQLNEDIGMTYDALIKLSLRHRPDILI 
IGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVYDRLIELGVNYQELENSLKLIAYQR 
LIGGGSLIDFETSNFKKHSSDKWNRQVDILAEEGYISKKQAQVEKIIPQETTESSPTF 

>SEQ ID NO 2352:63_18RS21 frame: 1 

VQSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
IKQMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
NEDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
DRLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
SKKQAQVEKIIPQETTESSPTF 



>SEQ ID NO 2353: 63_2603 frame: 1 

DIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFVAGMNVGEKRRSQLGSCDY 
ELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDNIKQMKEVLGIRGLYLFSG 
PVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQLNEDIGMTYDALIKLSLRH 
RPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVYDRLIELGVNYQELENSLK 
LIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHISKKQAQVRKNYPSRNNGK 
.SNF 

>SEQ ID NO 2354:63_A909 frame: 1 

VQSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
IKQMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
NEDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
DRLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
SKKQAQVEKIIPQETTESSPTF 

>SEQ ID NO 2355:63_CJB110 frame: 1 

VQSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
IKQMKEVLGTRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
NEDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSISGVY 
DRLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
SKKQAQVEKIIPQETTESSPTF 

>SEQ ID NO 2356:63_CJB110 frame: 1 

VQSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
IKQMKEVLGTRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
NEDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSISGVY 
DRLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
SKKQAQVEKIIPQETTESSPTF 

>SEQ ID NO 2357: 63_H36B frame: 1 

SLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFVAG 
MNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDNIK 
QMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQLNE 
DIGMTYDALIKLSLRHRPDILIIGEK 

>SEQ ID NO 2358 : 63_JM9130013 frame: X 

VQSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
IKQMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
NEDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
DRLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
SKKQAQVEKIIPQETTESSPTF 

>SEQ ID NO 2359:63_M732 frame: 3 

TCYETLYAYLMMKRRFIDVFEFNRMASLISHFKFVAGMNVGEKRRSQLGSCDYELSEGRL 
VSLRLSSVGDYRGQESLVIRTLYSGHQDLKYWFDNIK.MKEVLCARGLYLFSGPVGSGKT 
TLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQLNEDIGMTYDALIKLSLRHRPDILII 
GEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVYDRLIELGVNYQELENSLKLIAYQRL 
IGGGSLIDFETSNFKKHSSDKWNRQVDILAEEGHISKKQAQVEKIIPQETTESSPTF 

>SEQ ID NO 2360:63_M781 frame: 3 

VEVNAQDIYIIPKGDCYEFYMRIDDERRFIDVFEFNRMASLISHFKFVAGMNVGEKRRSQ 
LGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRTLYSGHQDLKYWFDNIKQMKEVLCARG 
LYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQLNEDIGMTYDALI 
KLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVYDRLIELGVNYQE 
LENSLKLIAYQRLIGGGSLIDFETSNFKKHSSDKWNRQVDILAEEGHISKKQAQVEKIIP 
QETTESSPTF 

>SEQ ID NO 2361:63_COH1 frame: 3 

VIVMKFYMRIDDERRFIDVFEFNRMASLISHFKFVAGMNVGEKRRSQLGSCDYELSEGRL 
VSLRLSSVGDYRGQESLVIRTLYSGHQDLKYWFDNIK 
SEQ2350 
SEQ2351 
SEQ2352 
SEQ2353 
SEQ2354 
SEQ2355 
SEQ2356 
SEQ2357 
SEQ2358 
SEQ2359 
SEQ2360 
SEQ2361 

SEQ2350 
SEQ2351 
SEQ2352 
SEQ2353 
SEQ2354 
SEQ2355 
SEQ2356 
SEQ2357 
SEQ2358 
SEQ2359 
SEQ2360 
SEQ2361 

SEQ2350 
SEQ2351 
SEQ2352 
SEQ2353 
SEQ2354 
SEQ2355 
SEQ2356 
SEQ2357 
SEQ2358 
SEQ2359 
SEQ2360 
SEQ2361 

SEQ2350 
SEQ2351 
SEQ2352 
SEQ2353 
SEQ2354 
SEQ2355 
SEQ2356 
SEQ2357 
SEQ2358 
SEQ2359 
SEQ2360 
SEQ2361 

SEQ2350 
SEQ2351 
SEQ2352 
SEQ2353 
SEQ2354 
SEQ2355 
SEQ2356 
SEQ2357 
SEQ2358 
SEQ2359 
SEQ2360 
SEQ2361 



AVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 

LLNLYYCVFDDERRFIDVFEFNRMASLISHFKFV 

QSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 

DIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 

QSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
QSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
QSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
-SLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 
QSLAKQVIHQAVEVNAQDIYIIPKGDCYELYMRIDDERRFIDVFEFNRMASLISHFKFV 

TCYETLYAYLMMKRRFIDVFEFNRMASLISHFKFV 

VEVNAQDIYIIPKGDCYEFYMRIDDERRFIDVFEFNRMASLISHFKFV 

VIVMKFYMRIDDERRFIDVFEFNRMASLISHFKFV 

AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRILYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRTLYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRTLYSGHQDLKYWFDN 
AGMNVGEKRRSQLGSCDYELSEGRLVSLRLSSVGDYRGQESLVIRTLYSGHQDLKYWFDN 

IKQMKEVLGTRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLGTRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLGTRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLGTRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLGIRGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IK-MKEVLCARGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IKQMKEVLCARGLYLFSGPVGSGKTTLMYQLASEVFKNKQIITIEDPVEIKNDKMLQLQL 
IK 

EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSISGVY 
EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSISGVY 
EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSISGVY 

EDIGMTYDALIKLSLRHRPDILIIGEK 

EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 
EDIGMTYDALIKLSLRHRPDILIIGEIRDQATARAVIRASLTGVMVFSTIHAKSIPGVY 



RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETSNFKKHSSDKWNRQVDILAEEGYI 
RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 

RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETGNFKKHSSDKWNRQVDILAEEGHI 
RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETSNFKKHSSDKWNRQVDILAEEGHI 
RLIELGVNYQELENSLKLIAYQRLIGGGSLIDFETSNFKKHSSDKWNRQVDILAEEGHI 
SEQ2350 
SEQ2351 
SEQ2352 
SEQ2353 
SEQ2354 
SEQ2355 
SEQ2356 
SEQ2357 
SEQ2358 
SEQ2359 
SEQ2360 
SEQ2361 



KKQAQVEKIIPQETTESSPTF 
KKQAQVEKIIPQETTESSPTF 
KKQAQVEKIIPQETTESSPTF 
KKQAQVRKNYPSRNNGKSNF- 
KKQAQVEKIIPQETTESSPTF 
KKQAQVEKIIPQETTESSPTF 
KKQAQVEKIIPQETTESSPTF 

KKQAQVEKIIPQETTESSPTF 
KKQAQVEKIIPQETTESSPTF 
KKQAQVEKIIPQETTESSPTF 
Figure 24: Comparative Sequences relating to SAG02 
(ABC transporter, substrate-binding protein) 

SEQ ID NO. 2401: SAG0290 FROM THE 1169NT1 GBS TYPE V STRAIN (REVERSE 
COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATACAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGACCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACAGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATA 
AAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGGAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCTGACTATATTGTAAAAGATCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGATTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGTATTAAAGTTTTGAAAGAAGATGGTACTTTGGCACGTTTAAG 
TAAACAATATTTCGGTGGAGATTACGTTTCAAACATTGATAAA 

SEQ ID NO. 2402: SAG0290 FROM THE 18RS21 GBS TYPE II STRAIN (REVERSE 
COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TRAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATACAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATA 
AAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGGAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAAGACCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGTATTAAAGTTTTGAAAGAAAATGGTACTTTGGCACGTTTAAG 

TAAACAATATTTCGGTGGAGATTACGTTTCAAACATTGATAAA 



SEQ ID NO. 2403: SAG0290 FROM THE 2603 V/R GBS TYPE V STRAIN (REVERSE 
COMPLEMENT) 

ATTCAAAGGTTATGATGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGA 
CAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCA 
TACAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGG 
GAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAGTTTTATCTGGCGTTA 
ACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATAAAAATCAAATATGTT 
TCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGGAAAATTGACTTTATCCTATATGA 
TGCCATTTCATCCGACTATATTGTAAAAGACCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAA 
TTGGTAATAATAAGGATGGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAG 
SEQ ID NO. 2405: SAG0290 FROM THE A909 GBS TYPE Ia STRAIN (REVERSE 
COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATACAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATNNTAATAAAAAACCANTA 
AAAATNAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGGAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAAGACCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGT 

SEQ ID NO. 2406: SAG0290 FROM THE CJB110 GBS NONTYPEABLE STRAIN (REVERSE 
COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATACAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATA 
AAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGGAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAAGACCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGTATTAAAGTTTTGAAAGAAAATGGTACTTTGGCACGTTTAAG 
TAAACAATATTTCGGTGGAGATTACGTTTCAAACATTGATAAA 

SEQ ID NO. 2407: SAG0290 FROM THE COH1 GBS TYPE III STRAIN (REVERSE 
COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TCAAAAAGACGGGAAATTCAAAGGTTATGACGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATATAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACAGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATA 
AAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGAAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCTGACTATATTGTAAAAGATCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGATTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGTATTAAAGTTTTGAAAGAAGATGGTACTTTGGCACGTTTAAG 
TAAACAATATTTCGGTGGAGATTACGTTTCAAACATTGATAAA 

SEQ ID NO. 2408: SAG0290 FROM THE H36b GBS TYPE Ib STRAIN 
(REVERSE COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATACAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATA 
AAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGGAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAAGACCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGTATTAAAGTTTTGAAAGAAAATGGTACTTTGGCACGTTTAAG 
TAAACAATATTTCGGTGGAGATTACGTTTCAAACATTGATAAA 
SEQ ID NO. 2409: SAG0290 FROM THE JM9130013 GBS STRAIN VIII (REVERSE 
COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATACAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATA 
AAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGGAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAAGACCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGTAATAAAGTTTTGAAAGAAAATGGTA 

SEQ ID NO. 2410: SAG0290 FROM THE M732 GBS TYPE III STRAIN (REVERSE 
COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TCAAAAAGACGGGAAATTCAAAGGTTATGACGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATATAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACAGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATA 
AAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGAAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCTGACTATATTGTAAAAGATCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGATTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGTATTAAAGTTTTGAAAGAAGATGGTACTTTGGCACGTTTAAG 
TAAACAATATTTCGGTGGAGATTACGTTTCAAACATTGATAAA 

SEQ ID NO. 2411: SAG0290 FROM THE M781 GBS TYPE III STRAIN (REVERSE 
COMPLEMENT) 

GTATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCACCATTTACTTA 
TCAAAAAGACGGGAAATTCAAAGGTTATGACGTTGATGTTGTCAAAGCTGTTTTTAAAGGTAGTAAGTACA 
AAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCAACAGGTATTGATGCAGGGAAATTTGATTTATCA 
GCTAATGATTTTTCATATAATAAAGAAAGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAA 
TTATGCCGTAGTAGGGAAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACAGAAG 
TTTTATCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAACCAATA 
AAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATTGAGAGTGGAAAAATTGA 
CTTTATCCTATATGATGCCATTTCATCTGACTATATTGTAAAAGATCAATCATTAAACTTAAGCGTTTCTC 
CTTTGAAAGGTAAAATTGGTAATAATAAGGATGGATTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGT 
AAAACTCTACAGAAATTTATAAATAAGCGTATTAAAGTTTTGAAAGAAGATGGTACTTTGGCACGTTTAAG 
TAAACAATATTTCGGTGGAGATTACGTTTCAAACATTGATAAA 

SEQ2401 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2402 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2403 
SEQ2404 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2405 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2406 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2407 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2408 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2409 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2410 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 
SEQ2411 TATCAGTTCAGGCGTCAGAGAAAGTAGAACTTAAAGTAGCTACAGATTCTGACACGGCA 



SEQ2401 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCT 
SEQ2402 CATTTACTTATRAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCT 
SEQ2403                          ATTCAAAGGTTATGATGTTGATGTTGTCAAAGCT 
SEQ2404 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCT 
SEQ2405 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCT 
SEQ2406 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCT 
SEQ2407 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGACGTTGATGTTGTCAAAGCT 
SEQ2408 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCT 
SEQ2409 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGATGTTGATGTTGTCAAAGCT 
SEQ2410 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGACGTTGATGTTGTCAAAGCT 

SEQ2411 CATTTACTTATCAAAAAGACGGGAAATTCAAAGGTTATGACGTTGATGTTGTCAAAGCT 

SEQ2401 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2402 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2403 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2404 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2405 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2406 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2407 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2408 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2409 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2410 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2411 GTTTTTAAAGGTAGTAAGTACAAAGTAACCTTCAAGACAGTTCCTTTTGATACTATTTCA 

SEQ2401 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATACAATAAAGAA 

SEQ2402 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATACAATAAAGAA 

SEQ2403 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATACAATAAAGAA 

SEQ2404 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATACAATAAAGAA 

SEQ2405 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATACAATAAAGAA 

SEQ2406 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATACAATAAAGAA 

SEQ2407 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATATAATAAAGAA 

SEQ2408 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATACAATAAAGAA 

SEQ2409 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATACAATAAAGAA 

SEQ2410 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATATAATAAAGAA 

SEQ2411 ACAGGTATTGATGCAGGGAAATTTGATTTATCAGCTAATGATTTTTCATATAATAAAGAA 

SEQ2401 AGAGCAGAAAAATATCTCTTCTCAGACCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2402 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2403 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2404 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2405 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2406 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2407 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2408 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2409 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2410 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2411 AGAGCAGAAAAATATCTCTTCTCAGATCCTATATCCCGTTCAAATTATGCCGTAGTAGGG 

SEQ2401 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACAGAAGTTTTA 

SEQ2402 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAGTTTTA 

SEQ2403 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAGTTTTA 

SEQ2404 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAGTTTTA 

SEQ2405 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAGTTTTA 

SEQ2406 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAGTTTTA 

SEQ2407 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACAGAAGTTTTA 

SEQ2408 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAGTTTTA 

SEQ2409 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACCGAAGTTTTA 

SEQ2410 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACAGAAGTTTTA 

SEQ2411 AAGAAGGGGAGCCATTACAAATCATTAAGTGACCTCTCTGGAAAATCAACAGAAGTTTTA 

SEQ2401 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2402 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2403 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2404 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2405 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATNNTAATAAAAAA 

SEQ2406 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2407 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2408 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2409 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2410 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2411 TCTGGCGTTAACTATGCACAGGTTCTAGAAAATTGGAATAAAAATCATCCTAATAAAAAA 

SEQ2401 CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 

SEQ2402 CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 

SEQ2403 CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 

SEQ2404 CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATACT 

SEQ2405 CCANTAAAAATNAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 

SEQ2406 CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATACT 

SEQ2407 CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 

SEQ2408 CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 

SEQ2409 CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 
CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 
CCAATAAAAATCAAATATGTTTCTGGGACAACTGGTGTTACTAGCAGATTAAAAAATATT 

GAGAGTGGGAAAATTGACTTTATCCTATATGATGCCATTTCATCTGACTATATTGTAAAA 
GAGAGTGGGAAAATTGACTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAA 
GAGAGTGGGAAAATTGACTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAA 
GAGAGTGGGAAAATTGACTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAA 
GAGAGTGGGAAAATTGACTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAA 
GAGAGTGGGAAAATTGACTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAA 
GAGAGTGGAAAAATTGACTTTATCCTATATGATGCCATTTCATCTGACTATATTGTAAAA 
GAGAGTGGGAAAATTGACTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAA 
GAGAGTGGGAAAATTGACTTTATCCTATATGATGCCATTTCATCCGACTATATTGTAAAA 
GAGAGTGGAAAAATTGACTTTATCCTATATGATGCCATTTCATCTGACTATATTGTAAAA 
GAGAGTGGAAAAATTGACTTTATCCTATATGATGCCATTTCATCTGACTATATTGTAAAA 

GATCAATCATTAAACTOAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GACCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GACCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GACCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GACCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GACCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GATCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAT^AATTGGTAATAATAAGGAT 
GACCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GACCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GATCAATC^TTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 
GATCAATCATTAAACTTAAGCGTTTCTCCTTTGAAAGGTAAAATTGGTAATAATAAGGAT 

GGATTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 
GGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 

GGACTAGAATACCT CCTTTTACCAAAAGATAAAAAAG 

GGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 
GGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 
GGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 
GGATTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 
GGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 
GGACTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 
GGATTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 
GGATTAGAATACCTCCTTTTACCAAAAGATAAAAAAGGTAAAACTCTACAGAAATTTATA 

ATAAGCGTATTAAAGTTTTGAAAGAAGATGGTACTTTGGCACGTTTAAGTAAACAATAT 
ATAAGCGTATTAAAGTTTTGAAAGAAAATGGTACTTTGGCACGTTTAAGTAAACAATAT 



ATAAGCGTATTAAAGTTTTGAAAGAAAATGGTACTTTGGCACGTTTAAGTAAACAATAT 



ATAAGCGT 



ATAAGCGTATTAAAGTTTTGAAAGAZVAATGGTACTTTGGCACGTTTAAGTAAACAATAT 
ATAAGCGTATTAAAGTTTTGAAAGAAGATGGTACTTTGGCACGTTTAAGTAAACAATAT 
ATAAGCGTATTAAAGTTTTGAAAGAAAATGGTACTTTGGCACGTTTAAGTAAACAATAT 

ATAAGCGTAATAAAGTTTTGAAAGAAAATGGTA ~ 

ATAAGCGTATTAAAGTTTTGAAAGAAGATGGTACOTTGGCACGTTTAAGTAAACAATAT 
ATAAGCGTATTAAAGTTTTGAAAGAAGATGGTACTTTGGCACGTTTAAGTAAACAATAT 



TCGGTGGAGATTACGTTTCAAACATTGATAAA- 
TCGGTGGAGATTACGTTTCAAACATTGATAAA- 



TCGGTGGAGATT ACGTTT CAAACATTGATAAA- 



TCGGTGGAGATTACGTTTCAAACATTGATAAA 

TCGGTGGAGATTACGTTTCAAACATTGATAAA 

TCGGTGGAGATTACGTTTCAAACATTGATAAA 



TCGGTGGAGATTACGTTTCAAACATTGATAAA 

TCGGTGGAGATTACGTTTCAAACATTGATAAAGTRCMARATVSTNCSRATNGTSAGABC 
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Figure 24: C%aparative Sequences relating to SAG02! 
(ABC transporter, substrate-binding protein) 





SEQ2401 

SEQ2402 

SEQ2403 

SEQ2404 

SEQ2405 

SEQ2406 

SEQ2407 

SEQ2408 

SEQ2409 

SEQ2410 

SEQ2411 RANSRTRSTBSTRATBNDNGRTN 

>SEQ ID NO 2450:8_1169NT frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKEDGTLARLSKQY
FGGDYVSNIDK

>SEQ ID NO 2451:8_18RS21 frame: 1

VSVQASEKVELKVATDSDTAPFTYXKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKENGTLARLSKQY
FGGDYVSNIDK

>SEQ ID NO 2452:8_2603 frame: 2

FKGYDVDVVKAVFKGSKYKVTFKTVPFDTISTGIDAGKFDLSANDFSYNKERAEKYLFSD
PISRSNYAVVGKKGSHYKSLSDLSGKSTEVLSGVNYAQVLENWNKNHPNKKPIKIKYVSG
TTGVTSRLKNIESGKIDFILYDAISSDYIVKDQSLNLSVSPLKGKIGNNKDGLEYLLLPK
DKK

>SEQ ID NO 2453:8_090 frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKENGTLARLSKQY
FGGDYVSNIDK

>SEQ ID NO 2454:8_A909 frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHXNKKPXKXKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKR

>SEQ ID NO 2455:8_CJB110 frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKENGTLARLSKQY
FGGDYVSNIDK

>SEQ ID NO 2456:8_COH1 frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKEDGTLARLSKQY
FGGDYVSNIDK

>SEQ ID NO 2457:8_H36B frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKENGTLARLSKQY
FGGDYVSNIDK
>SEQ ID NO 2458:8_JM9130013 frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRNKVLKENG

>SEQ ID NO 2459:8_M732 frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKEDGTLARLSKQY
FGGDYVSNIDK

>SEQ ID NO 2460:8_M781 frame: 1

VSVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKEDGTLARLSKQY
FGGDYVSNIDK

SEQ2450 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2451 SVQASEKVELKVATDSDTAPFTYXKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2452 FKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2453 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2454 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2455 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2456 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2457 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2458 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2459 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS
SEQ2460 SVQASEKVELKVATDSDTAPFTYQKDGKFKGYDVDVVKAVFKGSKYKVTFKTVPFDTIS

SEQ2450 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2451 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2452 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2453 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2454 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2455 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2456 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2457 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2458 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2459 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL
SEQ2460 TGIDAGKFDLSANDFSYNKERAEKYLFSDPISRSNYAVVGKKGSHYKSLSDLSGKSTEVL

SEQ2450 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2451 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2452 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2453 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2454 SGVNYAQVLENWNKNHXNKKPXKXKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2455 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2456 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2457 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2458 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2459 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK
SEQ2460 SGVNYAQVLENWNKNHPNKKPIKIKYVSGTTGVTSRLKNIESGKIDFILYDAISSDYIVK

SEQ2450 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKEDGTLARLSKQY
SEQ2451 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKENGTLARLSKQY
SEQ2452 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKK
SEQ2453 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKENGTLARLSKQY
SEQ2454 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKR
SEQ2455 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKENGTLARLSKQY
SEQ2456 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKEDGTLARLSKQY
SEQ2457 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKENGTLARLSKQY
SEQ2458 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRNKVLKENG
SEQ2459 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKEDGTLARLSKQY
SEQ2460 DQSLNLSVSPLKGKIGNNKDGLEYLLLPKDKKGKTLQKFINKRIKVLKEDGTLARLSKQY
Table 25: Comparative Sequences relating to SAG0368
(protein of unknown function) 

SEQ ID NO. 2501: SAG0368 FROM THE 090 GBS TYPE Ia STRAIN

TATAATTTTTCGACTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCTATTGAA

GAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAAT

AGCGATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTG 

ATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGAGTAGAAGCAAAGCTAAATGCAGCCTATGCTTCTGGTGGT 

GCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGA 

TTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAAT 

GAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATG 

CGCTATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATA 

TTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGCAGTAAGTAATAACATGCAAACTAATATTGAGATA 

TCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAA 

GACGCTACTTTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAG 

AAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACTACTGCT 

AGTAATGATTCTTCTACTTATTCATCAACACAAGAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGT

TACAGTGGTAATACTACTTATAGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTCCTGCT 

AGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCAATAATCATAACGGGGCTGCAACG 

CCTAATCCA 

SEQ ID NO. 2502: SAG0368 FROM THE 1169NT1 GBS TYPE V STRAIN 

TATAATTTTTCGACTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCTATTGAA

GAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTTGGTCAGGAAA 

TAGCGATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATT 

GATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGCGTAGAAGCAAAGCTAAATGCAGCCTATGCTTCTGGTGG 

TGCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGG 

ATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAA 

TGAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTAT 

GCGCTATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAAT 

ATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGCAGTAAGTAATAACATGCAAACTAATATTGAGAT

ATCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAAGGTGA 

AGACGCTACTTTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAA 

GAAAGAACTAGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACTACTGC

TAGTAATGATTCTTCTACTTATTCATCAACACAAGAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAG 

TTACAGTGGTAATACTACTTATAGTTCTGAGACTAATCAAACAACTCATCAAAGTTACTATAATAGTAGCACTCCTGC 

TAATAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCAATAATCATAATGGGGCTGCAAC 

GCCTAATCCA 

SEQ ID NO. 2503: SAG0368 FROM THE 18RS21 GBS TYPE II STRAIN

TATAATTTTTCGACTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCTATTGAA 

GAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAAT

AGCGATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTG 

ATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGAGTAGAAGCAAAGCTAAATGCAGCCTATGCTTCTGGTGGT 

GCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGA 

TTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAAT 

GAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATG 

CGCTATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATA 

TTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGCAGTAAGTAATAACATGCAAACTAATATTGAGATA 

TCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAA 

GACGCTACTTTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAG 

AAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACTACTGCT 

AGTAATGATTCTTCTACTTATTCATCAACACAAGAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGT 

TACAGTGGTAATACTACTTATAGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTCCTGCT 

AGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCAATAATCATAACGGGGCTGCAACG

CCTAATCCA 
SEQ ID NO. 2504: SAG0368 FROM THE 2603 V/R GBS TYPE V STRAIN 

TATAATTTTTCGACTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCTATTGAA 

GAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAAT 

AGCGATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTG 

ATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGAGTAGAAGCAAAGCTAAATGCAGCCTATGCTTCTGGTGGT 

GCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGA

TTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAAT 

GAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATG 

CGCTATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATA

TTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGCAGTAAGTAATAACATGCAAACTAATATTGAGATA

TCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAA

GACGCTACTTTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAG 

AAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACTACTGCT 

AGTAATGATTCTTCTACTTATTCATCAACAGAAGAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGT 

TACAGTGGTAATACTACTTATAGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTCCTGCT 

AGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCAATAATCATAACGGGGCTGCAACG 

CCTAATCCA 

SEQ ID NO. 2505: SAG0368 FROM THE A909 GBS TYPE Ia STRAIN

TATAATTTTTCGACTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCTATTGAA
GAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAAT 
AGCGATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTG 
ATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGAGTAGAAGCAAAGCTAAATGCAGCCTATGCTTCTGGTGGT 
GCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGA
TTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAAT 
GAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATG 
CGCTATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATA
TTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGCAGTAAGTAATAACATGCAAACTAATATTGAGATA
TCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAA
GACGCTACTTTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAG 
AAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACTACTGCT 
AGTAATGATTCTTCTACTTATTCATCAACACAAGAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGT 
TACAGTGGTAATACTACTTATAGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTCCTGCT 
AGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCAATAATCATAACGGGGCTGCAACG
CCTAATCCA 

SEQ ID NO. 2506: SAG0368 FROM THE CJB110 GBS NONTYPEABLE STRAIN (REVERSE
COMPLEMENT)

TATAATTTTTCGACTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCTATTGAA
GAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAAT 
AGCGATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTG 
ATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGAGTAGAAGCAAAGCTAAATGCAGCCTATGCTTCTGGTGGT 
GCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGA 
TTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAAT 
GAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATG
CGCTATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATA
TTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGCAGTAAGTAATAACATGCAAACTAATATTGAGATA
TCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAA
GACGCTACTTTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAG
AAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACTACTGCT
AGTAATGATTCTTCTACTTATTCATCAACACAAGAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGT 
TACAGTGGTAATACTACTTATTAGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTCCTGC 
TAGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCAATAATCATAACGGGGCTGCAAC 

GCCTAATCCA 
SEQ ID NO. 2507: SAG0368 FROM THE COH1 GBS TYPE III STRAIN (REVERSE
COMPLEMENT)

ACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAAAACT 

AATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGC 

GTAGAAGCAAAGCTAAATGCAGCCTATGCTTCTGGTGGTGCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGAT 

ATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGATTAGTTGATTTGGTCAATGCTGTTGGTGGTATAACAGTA 

ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACAT 

AAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCGCTATGATGATCCAGAGGGAGATTAT^ 

AGAQ\ACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTC^ 

TCCGCAGTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGAT 

TCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAAGACGCTACTCTATCAGATGGTGGCTCTTATCAAATTTTA 

ACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAGAAAGAGCTGGATAAAAAGCGTAGTAAAACTCTGAAGACA 

AGCGCGATTCTATATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTCTACTTATTCATCAACACAAGAGAAT 

JattaSatacaacacccttattcagaagcaccaccaagttacagtggtaatacta 
aacaactcatcaaagttactataatagtagcactcctgctagtaactatagcagtaacactaacacaggtcaggct 

ttcaagtggarg^ttartaattataacggggctgcaacgcctaatcc^ 

AACTAATCCA 

SEQ ID NO. 2508: SAG0368 FROM THE H36B GBS TYPE Ib STRAIN

TATAATTTTTCGACTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCTATTGAA 
GAA^AAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCATCGAAARTCTAAGTGGTCA^ 

AG^T^TATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACG 
ATTAARTTGAGTGGTCCCAAAAATAATGGACAGACTGGAGTAGAAGCAAAGCTAAATGCAGCC^ 

GCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGA 

?5agt^atttagtcartgctgttggtggtataacagtaactaataa^^ 

gaKccagagtacaaggctgttgttgaaccagggacacataaaataaatggagaacaag^ 

cgctatgatgattcagagggagattatgggcgtcaaaaaagacaac 

TTGGCGTTAAATAGTA 

SEQ ID NO. 2509: SAG0368 FROM THE

TTAGTTCATACAAAAAAATTCTTTCCGCAGTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTC 

atggtgStcttatcaaattttaactaagaaacatctacttc 
agSagtaaLctctgaagacaagcgcgattctatatgaag^^ 

CTTATTCATCAACACAAGAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGTTACAGTGGTAATACTA 
CTTATAGTTCT^AG^CTAATCAAACAACTCATCAAAATTACTATA 

ACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCAATAATCATAACGGGGCTGCAACGCCTAATCCA 

SEQ ID NO. 2510: SAG0368 FROM THE JM9130013 GBS TYPE VIII STRAIN (REVERSE
COMPLEMENT)

TATAATTTTTCGACTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCTATTG 

GAA^CAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAAT 

AGCGATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACARCGATGA 

a?SSg?ggtcccaa^ 

GCGGAAATGGCATTGATGACTGTTCAAGACTTATTAGA 

?Sgt^t?tagtcaatgctgttggtggtata^^ 

^Sgmgatc^gagggagattatgggcgtcaa^ 

tcatcaaaa^cgattcctaatttgttagcttataaagattcattggaacat 

AAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGC 

AGTAATGATTCTTCTACTTATTCATCAACACAAGAGAATAATTATAATACAACACCTTATTCAGAAGCAC 
TACAGTGGTAATACTACTTATAGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTCCTG 

cctaatcca 
SEQ ID NO. 2511: SAG0368 FROM THE M781 GBS TYPE III STRAIN (REVERSE 
COMPLEMENT) 

TTCAATACTATTAATGGGTGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAATAGCGATTCTATGAT 

CTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGG 

TCCCAAAAATAATGGACAGACTGGCGTAGAAGCAAAGCTAAATGCAGCCTATGCTTCTGGTGGTGCGGAAATGGCATT 

GATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGATTAGTTGATTTGGT

CAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAA 

GGCTGTTGTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCGCTATGATGATCC 

AGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATATTGGCGTTAAATAG 

TATTAGTTCATACAAAAAAATTCTTTCCGCAGTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGAT 

TCCTAATTTGTTAGCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAAGACGCTACTCTATC 

AGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAGAAAGAGCTGGATAA 

AAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTC 

TACTTATTCATCAACACAAGAGAATAATTATAATACAACACCTTATTCAGAAGCACGACCAAGTTACAGTGGTAATAC 

TACTTATAGTTCTGAGACTAATCAAACAACTCATCAAAGTTACTATAATAGTAGCACTCCTGCTAGTAACTATAGCAG 

TAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTTAATAATTATAACGGGGCTGCAACGCCTAATCCAAACAC 

AGGAACGCAACCAGTACCAGGTCAAACTAATCCA 

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 ATTTTAAGCTAGATAAATCAAAAAGTCATGCTATTGAAGAAACAAAGCCGTTTTCAATA
SEQ2508
SEQ2509
SEQ2510
SEQ2511 TTCAATA

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 TATTAATGGGTGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAATAGC
SEQ2508
SEQ2509
SEQ2510
SEQ2511 TATTAATGGGTGTGGACACAGGTTCAGAGCATCGAAAATCTAAGTGGTCAGGAAATAGC

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 ATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTA
SEQ2508
SEQ2509
SEQ2510
SEQ2511 ATTCTATGATCTTAGTCACTATAAATCCTAAAACTAATAAAACAACGATGACAAGCTTA

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 AACGTGACGTATTGATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGCGTAGAA
SEQ2508
SEQ2509
SEQ2510
SEQ2511 AACGTGACGTATTGATTAAATTGAGTGGTCCCAAAAATAATGGACAGACTGGCGTAGAA

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 CAAAGCTAAATGCAGCCTATGCTTCTGGTGGTGCGGAAATGGCATTGATGACTGTTCAA
SEQ2508
SEQ2509
SEQ2510
SEQ2511 CAAAGCTAAATGCAGCCTATGCTTCTGGTGGTGCGGAAATGGCATTGATGACTGTTCAA

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 ACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGATTAGTTGAT
SEQ2508
SEQ2509
SEQ2510
SEQ2511 ACTTATTAGATATTAATGTTGATTACTTTATGCAAATTAATATGCAAGGATTAGTTGAT

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 TGGTCAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATT
SEQ2508
SEQ2509
SEQ2510
SEQ2511 TGGTCAATGCTGTTGGTGGTATAACAGTAACTAATAAATTTGACTTTCCAATATCAATT

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 CTGCCAATGAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACATAAAATAAATGGA
SEQ2508
SEQ2509
SEQ2510
SEQ2511 CTGCCAATGAACCAGAGTACAAGGCTGTTGTTGAACCAGGGACACATAAAATAAATGGA

SEQ2501
SEQ2502
SEQ2503
SEQ2504
SEQ2505
SEQ2506
SEQ2507 AACAAGCACTTGTTTATTCTCGTATGCGCTATGATGATCCAGAGGGAGATTATGGGCGT
SEQ2508
SEQ2509
SEQ2510
SEQ2511 AACAAGCACTTGTTTATTCTCGTATGCGCTATGATGATCCAGAGGGAGATTATGGGCGT

SEQ2501 TATAATTTTTCG
SEQ2502 TATAATTTTTCG
SEQ2503 TATAATTTTTCG
SEQ2504 TATAATTTTTCG
SEQ2505 TATAATTTTTCG
SEQ2506 TATAATTTTTCG
SEQ2507 AAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATATTGGCGTTAAATAGT
SEQ2508 TATAATTTTTCG
SEQ2509
SEQ2510 TATAATTTTTCG
SEQ2511 AAAAAAGACAACGTGAAGTAATTCAAAAAGTCCTTAAAAAAATATTGGCGTTAAATAGT

SEQ2501 CTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCT
SEQ2502 CTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCT
SEQ2503 CTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCT
SEQ2504 CTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCT
SEQ2505 CTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCT
SEQ2506 CTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCT
SEQ2507 TTAGTTCAT-ACAAAAAAATTCTTTCCGCAGTAAGTAA--TAACATGCAAACTAATATT
SEQ2508 CTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCT
SEQ2509 TTAGTTCAT-ACAAAAAAATTCTTTCCGCAGTAAGTAA--TAACATGCAAACTAATATT
SEQ2510 CTAATGAATTGTCTAAGACTTTTAAAGATTTTAAGCTAGCTAAATCAAAAAGTCATGCT
SEQ2511 TTAGTTCAT-ACAAAAAAATTCTTTCCGCAGTAAGTAA--TAACATGCAAACTAATATT

SEQ2501 TTGAAGAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCAT
SEQ2502 TTGAAGAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCAT
SEQ2503 TTGAAGAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCAT
SEQ2504 TTGAAGAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCAT
SEQ2505 TTGAAGAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCAT
SEQ2506 TTGAAGAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCAT
SEQ2507 AGATATCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCA---TTGGAACAT
SEQ2508 TTGAAGAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCAT
SEQ2509 AGATATCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCA---TTGGAACAT
SEQ2510 TTGAAGAAACAAAGCCGTTTTCAATACTATTAATGGGGGTGGACACAGGTTCAGAGCAT
SEQ2511 AGATATCATCAAAAACGATTCCTAATTTGTTAGCTTATAAAGATTCA---TTGGAACAT

SEQ2501 GAAAATCTAAGT-GGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAA
SEQ2502 GAAAATCTAAGTTGGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAA
SEQ2503 GAAAATCTAAGT-GGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAA
SEQ2504 GAAAATCTAAGT-GGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAA
SEQ2505 GAAAATCTAAGT-GGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAA
SEQ2506 GAAAATCTAAGT-GGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAA
SEQ2507 TTAAATCTTATC-AGTTGAAGGGTGAAGACGCTACTCTATCAG--ATGGTGGCTCTTAT
SEQ2508 GAAAATCTAAGT-GGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAA
SEQ2509 TTAAATCTTATC-AGTTGAAGGGTGAAGACGCTACTTTATCAG--ATGGTGGCTCTTAT
SEQ2510 GAAAATCTAAGT-GGTCAGGAAATAGCGATTCTATGATCTTAGTCACTATAAATCCTAA
SEQ2511 TTAAATCTTATC-AGTTGAAGGGTGAAGACGCTACTCTATCAG--ATGGTGGCTCTTAT

SEQ2501 ACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCC
SEQ2502 ACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCC
SEQ2503 ACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCC
SEQ2504 ACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCC
SEQ2505 ACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCC
SEQ2506 ACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCC
SEQ2507 AAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAGAAAGAGCTGGAT
SEQ2508 ACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCC
SEQ2509 AAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAGAAAGAACTGGAT
SEQ2510 ACTAATAAAACAACGATGACAAGCTTAGAACGTGACGTATTGATTAAATTGAGTGGTCC
SEQ2511 AAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAATAGAATTAAGAAAGAGCTGGAT

SEQ2501 AAAAATAATGGACAGACTGGAGTAGAAGCAAAG--CTAAATGCAGCCTATGCTTCTGGT
SEQ2502 AAAAATAATGGACAGACTGGCGTAGAAGCAAAG--CTAAATGCAGCCTATGCTTCTGGT
SEQ2503 AAAAATAATGGACAGACTGGAGTAGAAGCAAAG--CTAAATGCAGCCTATGCTTCTGGT
SEQ2504 AAAAATAATGGACAGACTGGAGTAGAAGCAAAG--CTAAATGCAGCCTATGCTTCTGGT
SEQ2505 AAAAATAATGGACAGACTGGAGTAGAAGCAAAG--CTAAATGCAGCCTATGCTTCTGGT
SEQ2506 AAAAATAATGGACAGACTGGAGTAGAAGCAAAG--CTAAATGCAGCCTATGCTTCTGGT
SEQ2507 AAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACT
SEQ2508 AAAAATAATGGACAGACTGGAGTAGAAGCAAAG--CTAAATGCAGCCTATGCTTCTGGT
SEQ2509 AAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACT
SEQ2510 AAAAATAATGGACAGACTGGAGTAGAAGCAAAG--CTAAATGCAGCCTATGCTTCTGGT
SEQ2511 AAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCTATATGAAGATTACTATGGTACT

SEQ2501 GTGC-GGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTT
SEQ2502 GTGC-GGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTT
SEQ2503 GTGC-GGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTT
SEQ2504 GTGC-GGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTT
SEQ2505 GTGC-GGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTT
SEQ2506 GTGC-GGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTT
SEQ2507 CTGCTAGTAATGATTCTTCTACTTATTCATCAAC-ACAAGAGAATTATTATTAT-ACAA
SEQ2508 GTGC-GGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTT
SEQ2509 CTGCTAGTAATGATTCTTCTACTTATTCATCAAC-ACAAGAGAATAATTATAAT-ACAA
SEQ2510 GTGC-GGAAATGGCATTGATGACTGTTCAAGACTTATTAGATATTAATGTTGATTACTT
SEQ2511 CTGCTAGTAATGATTCTTCTACTTATTCATCAAC-ACAAGAGAATAATTATAAT-ACAA

SEQ2501 ATGCAAATTAATATGCAAGGATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGT
SEQ2502 ATGCAAATTAATATGCAAGGATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGT
SEQ2503 ATGCAAATTAATATGCAAGGATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGT
SEQ2504 ATGCAAATTAATATGCAAGGATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGT
SEQ2505 ATGCAAATTAATATGCAAGGATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGT
SEQ2506 ATGCAAATTAATATGCAAGGATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGT
SEQ2507 ACCCTTATTCAGAAGCACCACCAAGTTACAGTGGT-AATACTACTTATAGTT---CTGA
SEQ2508 ATGCAAATTAATATGCAAGGATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGT
SEQ2509 ACC-TTATTCAGAAGCACCACCAAGTTACAGTGGT-AATACTACTTATAGTT---CTGA
SEQ2510 ATGCAAATTAATATGCAAGGATTAGTTGATTTAGTCAATGCTGTTGGTGGTATAACAGT
SEQ2511 ACC-TTATTCAGAAGCACCACCAAGTTACAGTGGT-AATACTACTTATAGTT---CTGA

SEQ2501 ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGT
SEQ2502 ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGT
SEQ2503 ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGT
SEQ2504 ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGT
SEQ2505 ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGT
SEQ2506 ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGT
SEQ2507 ACTAATCAAAC-AACTCATCAA AGTTACTAT-AATAG--TAGCACTCCTGCT AGT
SEQ2508 ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGT
SEQ2509 ACTAATCAAAC-AACTCATCAA AATTACTAT-AATAG--TAGCACTCCTGCT AGT
SEQ2510 ACTAATAAATTTGACTTTCCAATATCAATTGCTGCCAATGAACCAGAGTACAAGGCTGT
SEQ2511 ACTAATCAAAC-AACTCATCAA AGTTACTAT-AATAG--TAGCACTCCTGCT AGT
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SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 

SEQ2501 
SE02502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 

SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 

SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 

SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 



Table 25: Comparative Sequences relating to SAG036^^ 
(protein of unknown function) 

GTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCG 

GTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCG 

GTTGAACCAGGGACACATAAAAT AAATGGAGAAC AAGCACT TGT TTATTCTCGT ATGCG 

GTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCG 

GTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCG 

GTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCG 

ACTATAGCAGTAACAC-TAACACAGGTCAGGCTGATTCAAGTGGAAGTGTTAATAATTA 

GTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCG 

ACTATAGCAGTAACAC-TAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCAATAATCA 

GTTGAACCAGGGACACATAAAATAAATGGAGAACAAGCACTTGTTTATTCTCGTATGCG 

ACTATAGCAGTAACAC-TAACACAGGTCAGGCTGATTCAAGTGGAAGTGTTAATAATTA

TATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAA 
TATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAA
TATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAA 
TATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAA 
TATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAA 
TATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAA 
AACGGGGCTGCAACGCCTAATCCAAACACAGGAACGCAACCAGTACCAGGTCAAACTAA 
TATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAA 

AACGGGGCTGCAACGCCTAATCCA-----------------------------------

TATGATGATCCAGAGGGAGATTATGGGCGTCAAAAAAGACAACGTGAAGTAATTCAAAA
AACGGGGCTGCAACGCCTAATCCAAACACAGGAACGCAACCAGTACCAGGTCAAACTAA 

GTCCTTAAAAAAATATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGC
GTCCTTAAAAAAATATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGC
GTCCTTAAAAAAATATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGC
GTCCTTAAAAAAATATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGC
GTCCTTAAAAAAATATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGC
GTCCTTAAAAAAATATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGC

CCA 

GTCCTTAAAAAAATATTGGCGTTAAATAGTA 



GTCCTTAAAAAAATATTGGCGTTAAATAGTATTAGTTCATACAAAAAAATTCTTTCCGC

CCA 

GTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTCCTAATTTGTT
GTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTCCTAATTTGTT 
GTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTCCTAATTTGTT 
GTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTCCTAATTTGTT 
GTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTCCTAATTTGTT 
GTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTCCTAATTTGTT 



GTAAGTAATAACATGCAAACTAATATTGAGATATCATCAAAAACGATTCCTAATTTGTT 



GCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAAGACGCTAC 
GCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAAGGTGAAGACGCTAC 
GCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAAGACGCTAC 
GCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAAGACGCTAC 
GCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAAGACGCTAC 
GCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAAGACGCTAC 



GCTTATAAAGATTCATTGGAACATATTAAATCTTATCAGTTGAAGGGTGAAGACGCTAC 

SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 

SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 

SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 

SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508
SEQ2509 
SEQ2510 
SEQ2511 

SEQ2501 
SEQ2502 
SEQ2503 
SEQ2504 
SEQ2505 
SEQ2506 
SEQ2507 
SEQ2508 
SEQ2509 
SEQ2510 
SEQ2511 



TTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAA 
TTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAA 
TTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAA
TTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAA 
TTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAA 
TTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAA 



TTATCAGATGGTGGCTCTTATCAAATTTTAACTAAGAAACATCTACTTGCAGTTCAAAA 



AGAATTAAGAAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCT 
AGAATTAAGAAAGAACTAGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCT 
AGAATTAAGAAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCT 
AGAATTAAGAAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCT 
AGAATTAAGAAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCT 
AGAATTAAGAAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCT



AGAATTAAGAAAGAACTGGATAAAAAGCGTAGTAAAACTCTGAAGACAAGCGCGATTCT 



TATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTCTACTTATTCATCAACACA 
TATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTCTACTTATTCATCAACACA 
TATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTCTACTTATTCATCAACACA 
TATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTCTACTTATTCATCAACACA 
TATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTCTACTTATTCATCAACACA
TATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTCTACTTATTCATCAACACA



TATGAAGATTACTATGGTACTACTGCTAGTAATGATTCTTCTACTTATTCATCAACACA



GAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGTTACAGTGGTAATAC 
GAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGTTACAGTGGTAATAC
GAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGTTACAGTGGTAATAC 
GAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGTTACAGTGGTAATAC 
GAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGTTACAGTGGTAATAC
GAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGTTACAGTGGTAATAC 



GAGAATAATTATAATACAACACCTTATTCAGAAGCACCACCAAGTTACAGTGGTAATAC 



ACTTAT-AGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTC 
ACTTAT-AGTTCTGAGACTAATCAAACAACTCATCAAAGTTACTATAATAGTAGCACTC 
ACTTAT-AGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTC 
ACTTAT-AGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTC
ACTTAT-AGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTC
ACTTATTAGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTC



ACTTAT-AGTTCTGAGACTAATCAAACAACTCATCAAAATTACTATAATAGTAGCACTC 

SEQ2501 TGCTAGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCA

SEQ2502 TGCTAATAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCA

SEQ2503 TGCTAGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCA

SEQ2504 TGCTAGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCA 

SEQ2505 TGCTAGTAACTATAGCAGTAACACTAACACAGGTCAGGCTGATTCAAGTGGAAGTGTCA

SEQ2506 TGCTAGTAACTATAGCAGTAACACT^^ 

SEQ2507

SEQ2508

SEQ2510 TCCTAGT^CT^^ 

SEQ2511 

SEQ2501 TAATCATAACGGGGCTGCAACGCCTAATCCA 

SEQ2502 TAATCATAATGGGGCTGCAACGCCTAATCCA 

SEQ2503 TAATCATAACGGGGCTGCAACGCCTAATCCA 

SEQ2504 TAATCATAACGGGGCTGCAACGCCTAATCCA 

SEQ2505 TAATCATAACGGGGCTGCAACGCCTAATCCA 

SEQ2506 TAATCATAACGGGGCTGCAACGCCTAATCCA 

SEQ2507 

SEQ2508

SEQ2509 
SEQ2510 
SEQ2511 



TAATCATAACGGGGCTGCAACGCCTAATCCA 



>SEQ ID NO 2550:54_090 frame: 1

YNFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT
INPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV
DYFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
RMRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP
NLLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
AILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS 

STPASNYSSNTNTGQADSSGSVNNHNGAATPNP 
>SEQ ID NO 2551:54_1169NT frame: 1

YNFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKLVRK.RFYDLSH
YKS N NNDDKLRT . RID . IEWSQK . WTDWRRSKAKCSLCFWWCGNGIDDCSRLIRY . C 

LLYAN • YARIS . FSQCCWWYNSN . . I . LSNINCCQ . TRVQGCC . TRDT . NKWRTSTCLF 
SYAL SRGRLWASKKTT . SNSKSP . KNIGVK. Y. FIQKNSFRSK* • HAN . Y . DIIKNDS 
FVSl/. RFIGTY- ILSVER. RRYFIRWWLLSNFN ■ ETSTCSSK.N . ERTR.KA. . NSEDK 
RDSI . RLLWYYC . . . FFYLFINTRE . I» . YNTLFRSTTKLQW • YYL . F.D. SNNSSKLL . . 

. HSC. .L.Q.H. HRS G . FKWKCQ - S . WGCN A . S 

>SEQ ID NO 2552:54_18RS21 frame: 1

YNFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT
INPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV
DYFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
RMRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP
NLLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
AILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS
STPASNYSSNTNTGQADSSGSVNNHNGAATPNP

>SEQ ID NO 2553:54_2603 frame: 1

YNFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT
INPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV
DYFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
RMRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP
NLLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
AILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS
STPASNYSSNTNTGQADSSGSVNNHNGAATPNP

AILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS 
STPASNYSSNTNTGQADSSGSVNNHNGAATPNP 



>SEQ ID NO 2555:54_CJB110 frame: 1

YNFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT
INPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV
DYFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
RMRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP
NLLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
AILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTY.F.D.SNNSSKLL..

>SEQ ID NO 2556:54_COH1 frame: 1

DFKLDKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVTINPKTNKTTMTSL
ERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINVDYFMQINMQGLVD
LVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYSRMRYDDPEGDYGR
QKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIPNLLAYKDSLEHIK
SYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTSAILYEDYYGTTAS
NDSSTYSSTQENYYYTTPLFRSTTKLQW . YYI* . F . D . SNNSSKLL * . .HSC. .L.Q.H.H 
RSG . FKWKC - .L. RGCNA . SKHRNATSTRSN , S 

>SEQ ID NO 2557:54_H36B frame: 1 

YNFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT 
INPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV
DYFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
RMRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP
NLLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
AILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS
STPASNYSSNTNTGQADSSGSVNNHNGAATPNP

>SEQ ID NO 2558:54_JM9130013 frame: 1

YNFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT 
INPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV
DYFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
RMRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP
NLLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
AILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS
STPASNYSSNTNTGQADSSGSVNNHNGAATPNP 

>SEQ ID NO 2559:54_M781 frame: 2

SILLMGVDTGSEHRKSKWSGNSDSMILVTINPKTNKTTMTSLERDVLIKLSGPKNNGQTG
VEAKLNAAYASGGAEMALMTVQDLLDINVDYFMQINMQGLVDLVNAVGGITVTNKFDFPI
SIAANEPEYKAVVEPGTHKINGEQALVYSRMRYDDPEGDYGRQKRQREVIQKVLKKILAL
NSISSYKKILSAVSNNMQTNIEISSKTIPNLLAYKDSLEHIKSYQLKGEDATLSDGGSYQ
ILTKKHLLAVQNRIKKELDKKRSKTLKTSAILYEDYYGTTASNDSSTYSSTQENNYNTTP
YSEAPPSYSGNTTYSSETNQTTHQSYYNSSTPASNYSSNTNTGQADSSGSVNNYNGAATP 
NPNTGTQPVPGQTNP 



SEQ2550 NFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT

SEQ2551 NFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKLVRKRFYDLSHY 

SEQ2552 NFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT 

SEQ2553 NFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT

SEQ2554 NFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT 

SEQ2555 NFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT 

SEQ2556 DFKLDKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT 

SEQ2557 NFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT

SEQ2558 NFSTNELSKTFKDFKLAKSKSHAIEETKPFSILLMGVDTGSEHRKSKWSGNSDSMILVT 

SEQ2559 SILLMGVDTGSEHRKSKWSGNSDSMILVT 

SEQ2550 NPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLN^ 

SEQ2551 SNNNDDKLRTRIDIEWSQKWTDWRRS KAKCSLCFWWCGNGIDDCSRLIRYCLLY 

SEQ2552 NPKTNKTTMTSI^RDVLIKLSGPKNNGQTGVEAKLN^ 

SEQ2 553 NPKTNKTTMTSI^RDVLIKLSGPKNNGQTGVEAKL^^ 

SEQ2554 NPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV 

SEQ2555 NPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV

SEQ2556 NPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV

SEQ2557 NPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQD

SEQ2558 
SEQ2559 

SEQ2550 
SEQ2551 
SEQ2552 
SEQ2553 
SEQ2554 
SEQ2555 
SEQ2556 
SEQ2557 
SEQ2558 
SEQ2559 

SEQ2550 
SEQ2551 
SEQ2552 
SEQ2553 
SEQ2554 
SEQ2555
SEQ2556 
SEQ2557 
SEQ2558 
SEQ2559 

SEQ2550 
SEQ2551 
SEQ2552 
SEQ2553 
SEQ2554 
SEQ2555 
SEQ2556 
SEQ2557 
SEQ2558 
SEQ2559 

SEQ2550 
SEQ2551 
SEQ2552 
SEQ2553 
SEQ2554
SEQ2555 
SEQ2556 
SEQ2557 
SEQ2558 
SEQ2559

SEQ2550 
SEQ2551 
SEQ2552 
SEQ2553 
SEQ2554 
SEQ2555 
SEQ2556 
SEQ2557 
SEQ2558 
SEQ2559



NPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLDINV
NPKTNKTTMTSLERDVLIKLSGPKNNGQTGVEAKLNAAYASGGAEMALMTVQDLLD 

YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS

NYARISFSQCCWWYNS NILSNINCCQTRVQGCCTRDTNKWRTSTCLFSY 

YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS

YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS
YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS

YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS

YFMQINMQGLVDLVNAVGGITVTNKFDFPISIAANEPEYKAVVEPGTHKINGEQALVYS

MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP 
LSRGRLWASKKTTSNSKSPKNIGVKYFIQKNSFRSKHANYDIIKNDSFVSLRFIGTYI- 
MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP 
MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP
MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP 
MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP
MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP 
MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP 
MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP 
MRYDDPEGDYGRQKRQREVIQKVLKKILALNSISSYKKILSAVSNNMQTNIEISSKTIP

LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
L-SVERRRYFIRWMLLSNFNETSTCSSKNERTRKANSEDKRDSIRLLWYYCFFYLFINT 
LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS
LLAYKDSLEHIKSYQLKGEDATLSDGGSYQILTKKHLLAVQNRIKKELDKKRSKTLKTS

ILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS 

ELY NTLFRST TKLQWYYLFDSNNSSKLLHSCLQH 

ILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS 
ILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS 
ILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS 

ILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYFDSNNSSKLL 

ILYEDYYGTTASNDSSTYSSTQENYYYTTPLFRSTTKLQWYYLFDSNNSSKLLHSCLQH
ILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS 
ILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQNYYNS 
ILYEDYYGTTASNDSSTYSSTQENNYNTTPYSEAPPSYSGNTTYSSETNQTTHQSYYNS 

TPASNYSSNTNTGQADSSGSVNNHNGAATPNP

RSGFKWKCQSWGCNAS 

TPASNYSSNTNTGQADSSGSVNNHNGAATPNP

TPASNYSSNTNTGQADSSGSVNNHNGAATPNP 

TPASNYSSNTNTGQADSSGSVNNHNGAATPNP



RSGFKWKCLRGCNASKHRNATSTRSNS 

TPASNYSSNTNTGQADSSGSVNNHNGAATPNP

TPASNYSSNTNTGQADSSGSVNNHNGAATPNP

TPASNYSSNTNTGQADSSGSVNNYNGAATPNPNTGTQPVPGQTNP 



Table 26: Comparative Sequences relating to SAG0503 (lipase/acylhydrolase)

SEQ ID NO. 2601: SAG0503 FROM THE 090 GBS TYPE Ia STRAIN
(REVERSE COMPLEMENT) 

GGGCACAAGTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAATCCTAAATTAACAAAAAA
AGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATAC 
AACCTCTCAAGGTGGTTTTGTCCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAA 
TTATGGTGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGAAAAAGATTTAGA
GAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATC
ACTAAATTCCTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATACTTGCAAAAGCAAGACAAGATAA
TCCTAAATTGCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAAC 
CGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGA
CCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTATTACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGC 
TCTCTTTACTGGAGACCATTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA 
TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAAG 

SEQ ID NO. 2602: SAG0503 FROM THE H36b GBS TYPE Ib STRAIN
(REVERSE COMPLEMENT) 

TTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAATCCTAAATTAACAAAAAAAGACTTCCT 
AACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTCA 
AGGTGGTTTTGTTCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGGTGT 
GTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGAAAAAGATTTAGAGAAAGCTGA 
TTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTC 
CTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATCCTTGCAAAAGCAAGACAAGATAATCCTAAATT 
GCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAACCGTTATTGA 
TAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGACCGCCTTTA
TAAGGGAATAAATGGTAAAGAGGGTATTATAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTAC
TGGAGACCATTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAATGAAACAAG 
AAAAAACTGGCCGAACCCAGCTTTCTTGTACAAAGTGGTCC 

SEQ ID NO. 2603: SAG0503 FROM THE 18RS21 GBS TYPE II STRAIN (REVERSE 
COMPLEMENT) 

GTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAATCCTAAATTAACAAAAAAAGACTTCC 
TAACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTC 
AAGGTGGTTTTGTTCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGGTG
TGTCTGGGAATACTAGTC^CAAATTTTAAAACGTATGACGAGAGATC^ 

ATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATT
CCTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATCCTTGCAAAAGCAAGACAAGATAATCCTAAAT 
TGCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAACCGTTATTG 
ATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGACCGCCTTT 
ATAAGGGAATAAATGGTAAAGAGGGTATTACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTA 
CTGGAGACCATTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAATGAAACAA
GAAAAAACTGGCCGAACCCAGCTTTCTTGTACAA 

SEQ ID NO. 2604: SAG0503 FROM THE COH1 GBS TYPE III STRAIN (REVERSE 
COMPLEMENT) 

GGACAAGTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAATCCTAAATTAACAAAAAAAG 
ACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGGGATACAA 
CCTCTCAAGGTGGTTTTGTCCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATT 
ATGGTGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGAAAAAGATTTAGAGA 
AAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATCAC 
TAAATTCCTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATTCTTGCAAAAGCAAGACAAGATAATC
CTAAATTGCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAACCG 
TTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGACC 
GCCTTTATAAGGGAATAAATGGTAAAGAGGGTATTACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTC 
TCTTTACTGGAGACCATTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAATG 
AAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAA 

SEQ ID NO. 2605: SAG0503 FROM THE CJB110 GBS NONTYPEABLE STRAIN (REVERSE
COMPLEMENT) 

GTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGAT<^ 

TAACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTC 
AAGGTGGTTTTGTCCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGGTG 
TGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGAAAAAGATTTAGAGAAAGCTG 
ATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATT 
CCTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATACTTGCAAAAGCAAGACAAGATAATCCTAAAT
TGCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAACCGTTATTG 
ATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGACCGCCTTT 
ATAAGGGAATAAATGGTAAAGAGGGTATTACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTA 
CTGGAGACCATTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAATGAAACAA 
GAAAAAACTGGCCGAACCCAGCTTTCTTGTACAA 

SEQ ID NO. 2606: SAG0503 FROM THE 1169NT1 GBS TYPE V STRAIN (REVERSE 
COMPLEMENT) 

GTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAATCCTAAATTAACAAAAAAAGACTTCC 
TAACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGGGATACAACCTCTC 
AAGGTGGTTTTGTCCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGGTG 
TGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGAAAAAGATTTAGAGAAAGCTG 
ATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATT 
CCTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATTCTTGCAAAAGCAAGACAAGATAATCCTAAAT
TGCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAACCGTTATTG
ATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGACCGCCTTT
ATAAGGGAATAAATGGTAAAGAGGGTATTACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTA
CTGGAGACCATTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAATGAAACAA
GAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAA

SEQ ID NO. 2607: SAG0503 FROM THE JM9130013 GBS TYPE VIII STRAIN 
(REVERSE COMPLEMENT) 

GTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAATCCTAAATTAACAAAAAAAGACTTCC
TAACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTC
AAGGTGGTTTTGTTCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGGTG
TGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGAAAAAGATTTAGAGAAAGCTG
ATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATT
CCTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATCCTTGCAAAAGCAAGACAAGATAATCCTAAAT
TGCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAACCGTTATTG
ATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGACCGCCTTT
ATAAGGGAATAAATGGTAAAGAGGGTATTACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTA

CTGGAGACGATTTTCATCCGAATAATATTGGCTATGAAATC^TC 

GAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAA 

SEQ ID NO. 2608: SAG0503 FROM THE 2603 V/R GBS TYPE V STRAIN 
(REVERSE COMPLEMENT) 

AGTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAATCCTAAATTAACAAAAAAAGACTTC
CTAACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCT 
CAAGGTGGTTTTGTTCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGGT 
GTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGAAAAAGATTTAGAGAAAGCT 
GATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAAT 
TCCTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATCCTTGCAAAAGCAAGACAAGATAATCCTAAA
TTGCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAACCGTTATT
GATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGACCGCCTT 
TATAAGGGAATAAATGGTAAAGAGGGTATTACAGAGTC^TC^ 

ACTGGAGACCATTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAATGAAACA 
AGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAAGTGG 

SEQ ID NO. 2609: SAG0503 FROM THE M781 GBS TYPE III STRAIN
(REVERSE COMPLEMENT) 

GGACAAGTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAATCCTAAATTAACAAAAAAAG
ACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGCTCTTGGAGATTCTCTGACCGAAGGTGTGGGGGATACAA 
CCTCTCAAGGTGGTTTTGTCCCACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATT 
ATGGTGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGAAAAAGATTTAGAGA 
AAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGCTGTTATTCGTAAAGAGCTCAGTCATTTATCAC 
TAAATTCCTTTGAGAAACCAGCAGAAGCATATAAGGAACGTTTGAAAGAAATTCTTGCAAAAGCAAGACAAGATAATC
CTAAATTGCCTATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAATGCAAACCG 
TTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAATGTTTATTTTGTCCCAATTAATGACC
GCCTTTATAAGGGAATAAATGGTAAAGAGGGTATTACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTC
TCTTTACTGGAGACCATTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAATG 
AAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAA 



SEQ2601 GGCACAAGTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA 

SEQ2602 TTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA 

SEQ2603 GTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA

SEQ2604 GGACAAGTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA

SEQ2605 GTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA

SEQ2606 GTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA

SEQ2607 GTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA

SEQ2608 AGTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA

SEQ2609 GGACAAGTTTGTACAAAAAAGCAGGCTCTATTTTTTCCTTGATCATTCCAAAATCAAA

SEQ2601 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC 

SEQ2602 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC 

SEQ2603 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC 

SEQ2604 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC 

SEQ2605 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC

SEQ2606 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC

SEQ2607 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC

SEQ2608 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC

SEQ2609 TCCTAAATTAACAAAAAAAGACTTCCTAACAAAGAAAGTTATCCCACTTAACTATGTTGC

SEQ2601 TCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTCAAGGTGGTTTTGTCCC 

SEQ2602 TCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTCAAGGTGGTTTTGTTCC 

SEQ2603 TCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTCAAGGTGGTTTTGTTCC 

SEQ2604 TCTTGGAGATTCTCTGACCGAAGGTGTGGGGGATACAACCTCTCAAGGTGGTTTTGTCCC 

SEQ2605 TCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTCAAGGTGGTTTTGTCCC 

SEQ2606 TCTTGGAGATTCTCTGACCGAAGGTGTGGGGGATACAACCTCTCAAGGTGGTTTTGTCCC 

SEQ2607 TCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTCAAGGTGGTTTTGTTCC 

SEQ2608 TCTTGGAGATTCTCTGACCGAAGGTGTGGGCGATACAACCTCTCAAGGTGGTTTTGTTCC

SEQ2609 TCTTGGAGATTCTCTGACCGAAGGTGTGGGGGATACAACCTCTCAAGGTGGTTTTGTCCC

SEQ2601 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG 

SEQ2602 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG 

SEQ2603 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG 

SEQ2604 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG

SEQ2605 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG

SEQ2606 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG

SEQ2607 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG

SEQ2608 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG

SEQ2609 ACTGCTATCAGAATCACTCCATAATCGATACTCTTACCAAGTGACTTCTGTTAATTATGG

SEQ2601 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGA 

SEQ2602 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGA 

SEQ2603 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGA 

SEQ2604 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGA 

SEQ2605 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCX^ 

SEQ2606 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGA 

SEQ2607 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGA 

SEQ2608 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGA 

SEQ2609 TGTGTCTGGGAATACTAGTCAACAAATTTTAAAACGTATGACGACAGATCCTCAAATCGA 

SEQ2601 AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC 

SEQ2602 AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC 

SEQ2603 AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC

SEQ2605 
SEQ2606 
SEQ2607 
SEQ2608 
SEQ2609 



SEQ2601 
SEQ2602 
SEQ2603 
SEQ2604 
SEQ2605 
SEQ2606 
SEQ2607 
SEQ2608 
SEQ2609 

SEQ2601 
SEQ2602 
SEQ2603 
SEQ2604 
SEQ2605 
SEQ2606 
SEQ2607 
SEQ2608 
SEQ2609 

SEQ2601 
SEQ2602 
SEQ2603 
SEQ2604 
SEQ2605 
SEQ2606 
SEQ2607 
SEQ2608 
SEQ2609 

SEQ2601 
SEQ2602 
SEQ2603 
SEQ2604 
SEQ2605 
SEQ2606 
SEQ2607 
SEQ2608 
SEQ2609

SEQ2601 
SEQ2602 
SEQ2603 
SEQ2604 
SEQ2605 
SEQ2606 
SEQ2607 
SEQ2608 
SEQ2609 

SEQ2601 
SEQ2602 
SEQ2603 
SEQ2604 
SEQ2605 
SEQ2606 
SEQ2607 
SEQ2608 
SEQ2609 

SEQ2601 
SEQ2602 
SEQ2603 


AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC 
AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC 
AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC 
AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC 
AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC 
AAAAGATTTAGAGAAAGCTGATTTATTGACGCTAACTGTTGGTGGTAATGATGTCTTGGC 

TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 
TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 
TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 
TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 
TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 
TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 
TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 
TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 
TGTTATTCGTAAAGAGCTCAGTCATTTATCACTAAATTCCTTTGAGAAACCAGCAGAAGC 

ATATAAGGAACGTTTGAAAGAAATACTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 
ATATAAGGAACGTTTGAAAGAAATCCTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 
ATATAAGGAACGTTTGAAAGAAATCCTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 
ATATAAGGAACGTTTGAAAGAAATTCTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 
ATATAAGGAACGTTTGAAAGAAATACTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 
ATATAAGGAACGTTTGAAAGAAATTCTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 
ATATAAGGAACGTTTGAAAGAAATCCTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 
ATATAAGGAACGTTTGAAAGAAATCCTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 
ATATAAGGAACGTTTGAAAGAAATTCTTGCAAAAGCAAGACAAGATAATCCTAAATTGCC 

TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT 
TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT 
TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT
TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT 
TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT
TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT 
TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT 
TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT 
TATTTATGTTTTAGGCATTTATAATCCTTTTTACCTAAACTTTCCACAATTAACTAAAAT 

GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA 
GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA 
GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA 
GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA
GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA 
GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA 
GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA 
GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA 
GCAAACCGTTATTGATAATTGGAATAAAGCTACAAAAGAAGTAGTTGATGCTTCAGAAAA 

TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT 
TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT 
TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT 
TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT
TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT 
TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT 
TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT 
TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT 
TGTTTATTTTGTCCCAATTAATGACCGCCTTTATAAGGGAATAAATGGTAAAGAGGGTAT 

TACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 
TATAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 
TACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 
TACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 
TACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 
TACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 
TACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 
TACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 
TACAGAGTCATCAAATAGTCAGGCAAGTATCACTAATGATGCTCTCTTTACTGGAGACCA 

TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA
TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA 
TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA
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e 26: Comparative Sequences relating to SAG0503 (lipase/acylhydolase) 



SEQ2604 TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA

SEQ2605 TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA

SEQ2606 TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA

SEQ2607 TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA

SEQ2608 TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA

SEQ2609 TTTTCATCCCAATAATATTGGCTATCAAATCATGTCTAACGCCGTTATGGAGAAAATAAA

SEQ2601 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAAG

SEQ2602 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAAGTGGTCC

SEQ2603 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAA

SEQ2604 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAA

SEQ2605 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAA

SEQ2606 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAA

SEQ2607 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAA

SEQ2608 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAAGTGG

SEQ2609 TGAAACAAGAAAAAACTGGCCGAACCCAGCTTTCTTGTACAAA

SEQ2601 

SEQ2602 

SEQ2603 

SEQ2604 

SEQ2605 

SEQ2606 

SEQ2607 

SEQ2608 

SEQ2609



>SEQ ID NO 2650:103_090 frame: 2 

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVP
LLSESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLA
VIRKELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKM
QTVIDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDH
FHPNNIGYQIMSNAVMEKINETRKNWP

>SEQ ID NO 2651:103_H36B frame: 2

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLS
ESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIR
KELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTV
IDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGIIESSNSQASITNDALFTGDHFHP
NNIGYQIMSNAVMEKINETRKNWP

>SEQ ID NO 2652:103_18RS21 frame: 3

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLS
ESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIR
KELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTV
IDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHP
NNIGYQIMSNAVMEKINETRKNWP

>SEQ ID NO 2653:103_COH1 frame: 3

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPL
LSESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAV
IRKELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQ
TVIDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHF
HPNNIGYQIMSNAVMEKINETRKNWP

>SEQ ID NO 2654:103_CJB110 frame: 3

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLS
ESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIR
KELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTV
IDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHP
NNIGYQIMSNAVMEKINETRKNWP

>SEQ ID NO 2655:103_1169NT frame: 3

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLS
ESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIR
KELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTV
IDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHP
NNIGYQIMSNAVMEKINETRKNWP

>SEQ ID NO 2656:103_JM9130013 frame: 3

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLS
ESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIR
KELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTV
IDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHP
NNIGYQIMSNAVMEKINETRKNWP

>SEQ ID NO 2657:103_2603 frame: 1

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLL
SESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVI
RKELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQT
VIDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFH
PNNIGYQIMSNAVMEKINETRKNWP

>SEQ ID NO 2658:103_M781 frame: 3

IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPL
LSESLHNRYSYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAV
IRKELSHLSLNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQ
TVIDNWNKATKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHF
HPNNIGYQIMSNAVMEKINETRKNWP



SEQ2650 
SEQ2651 
SEQ2652 
SEQ2653
SEQ2654 
SEQ2655 
SEQ2656 
SEQ2657 
SEQ2658 

SEQ2650 
SEQ2651 
SEQ2652 
SEQ2653 
SEQ2654 
SEQ2655 
SEQ2656 
SEQ2657 
SEQ2658 

SEQ2650 
SEQ2651 
SEQ2652 
SEQ2653 
SEQ2654 
SEQ2655 
SEQ2656 
SEQ2657 
SEQ2658 

SEQ2650 
SEQ2651 
SEQ2652 
SEQ2653 
SEQ2654 
SEQ2655 
SEQ2656 
SEQ2657 
SEQ2658 

SEQ2650 
SEQ2651
SEQ2652 
SEQ2653 
SEQ2654 
SEQ2655 
SEQ2656 
SEQ2657 
SEQ2658



IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY
IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY
IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY
IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY
IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY
IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY
IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY
IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY
IFSLIIPKSNPKLTKKDFLTKKVIPLNYVALGDSLTEGVGDTTSQGGFVPLLSESLHNRY

SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS
SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS
SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS
SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS
SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS
SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS
SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS
SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS
SYQVTSVNYGVSGNTSQQILKRMTTDPQIEKDLEKADLLTLTVGGNDVLAVIRKELSHLS

LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA
LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA
LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA
LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA
LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA
LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA
LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA
LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA
LNSFEKPAEAYKERLKEILAKARQDNPKLPIYVLGIYNPFYLNFPQLTKMQTVIDNWNKA

TKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHPNNIGYQI
TKEVVDASENVYFVPINDRLYKGINGKEGIIESSNSQASITNDALFTGDHFHPNNIGYQI
TKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHPNNIGYQI
TKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHPNNIGYQI
TKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHPNNIGYQI
TKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHPNNIGYQI
TKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHPNNIGYQI
TKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHPNNIGYQI
TKEVVDASENVYFVPINDRLYKGINGKEGITESSNSQASITNDALFTGDHFHPNNIGYQI

MSNAVMEKINETRKNWP 
MSNAVMEKINETRKNWP 
MSNAVMEKINETRKNWP 
MSNAVMEKINETRKNWP 
MSNAVMEKINETRKNWP 
MSNAVMEKINETRKNWP 
MSNAVMEKINETRKNWP 
MSNAVMEKINETRKNWP 
MSNAVMEKINETRKNWP 
Table 27: Comparative Sequences relating to SAG1473
(cell wall surface anchor family protein)

SEQ ID NO. 2701: SAG1473 FROM THE 1169NT1 GBS TYPE V STRAIN 
(REVERSE COMPLEMENT)

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATCAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATC
CGTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGTAGA
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAA
GTGATGGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA

SEQ ID NO. 2702: SAG1473 FROM THE 18RS21 GBS TYPE II STRAIN 

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATCAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATC
CGTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGTAGA
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAA 
ATGATGGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

SEQ ID NO. 2703: SAG1473 FROM THE 2603 V/R GBS TYPE V STRAIN 

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATCAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATC
CGTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGTAGA
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAA
ATGATGGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

SEQ ID NO. 2704: SAG1473 FROM THE 090 GBS TYPE Ia STRAIN

GACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATCCGTC
AACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGTAGAACGA
AGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAG
AATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAATGA
TGGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

SEQ ID NO. 2705: SAG1473 FROM THE A909 GBS TYPE Ia STRAIN

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATTAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATC
CCTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGTAGC
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAA
ATGATGAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

SEQ ID NO. 2706: SAG1473 FROM THE CJB110 GBS NONTYPEABLE STRAIN

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATCAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATC
CGTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGTAGA
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAA
ATGATGGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA

SEQ ID NO. 2707: SAG1473 FROM THE COH1 GBS TYPE III STRAIN
(REVERSE COMPLEMENT) 

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATCAGATGA 
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCAAGTTCATCAAGTGAACCAGAAACAAATC 
CCTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGGAGC 
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT 
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGAACGCGATGAATCATCATCTTCAAAAGCAA 

ATGATGAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

SEQ ID NO. 2708: SAG1473 FROM THE H36b GBS TYPE Ib STRAIN

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATTAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATC
CCTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGTAGC
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAARAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAA
ATGATGAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA

SEQ ID NO. 2709: SAG1473 FROM THE JM9130013 GBS TYPE VIII STRAIN

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATTAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATC
CCTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGTAGC
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAA
ATGATGAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA

SEQ ID NO. 2710: SAG1473 FROM THE M732 GBS TYPE III STRAIN 

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATCAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCAAGTTCATCAAGTGAACCAGAAACAAATC
CCTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGGAGC
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGAACGCGATGAATCATCATCTTCAAAAGCAA
ATGATGAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA

SEQ ID NO. 2711: SAG1473 FROM THE M781 GBS TYPE III STRAIN

GATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATCAGATGA
ACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCAAGTTCATCAAGTGAACCAGAAACAAATC
CCTCAACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGATGGGAGC
ACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTAT
TAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAA
ATGATGAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA

SEQ2701 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA
SEQ2702 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA
SEQ2703 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA

SEQ2705 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA
SEQ2706 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA
SEQ2707 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA

SEQ2709 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA
SEQ2710 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA
SEQ2711 ATACAAGTGATAAGAATACTGACACGAGTGTCGTGACTACGACCTTATCTGAGGAGAAA
SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711



GATCAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCA
GATCAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCA 
GATCAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCA 

GACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCA 

GATTAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCA 
GATCAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCA 
GATCAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCAAGTTCA 
GATTAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCGAGTTCA 
GATCAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCAAGTTCA 
GATCAGATGAACTAGACCAGTCTAGTACTGGTTCTTCTTCTGAAAATGAATCAAGTTCA 

TCAAGTGAACCAGAAACAAATCCGTCAACTAATCCACCTACAACAGAACCATCGCAACCC 
TCAAGTGAACCAGAAACAAATCCGTCAACTAATCCACCTACAACAGAACCATCGCAACCC 
TCAAGTGAACCAGAAACAAATCCGTCAACTAATCCACCTACAACAGAACCATCGCAACCC 
TCAAGTGAACCAGAAACAAATCCGTCAACTAATCCACCTACAACAGAACCATCGCAACCC
TCAAGTGAACCAGAAACAAATCCCTCAACTAATCCACCTACAACAGAACCATCGCAACCC
TCAAGTGAACCAGAAACAAATCCGTCAACTAATCCACCTACAACAGAACCATCGCAACCC 
TCAAGTGAACCAGAAACAAATCCCTCAACTAATCCACCTACAACAGAACCATCGCAACCC 
TCAAGTGAACCAGAAACAAATCCCTCAACTAATCCACCTACAACAGAACCATCGCAACCC 
TCAAGTGAACCAGAAACAAATCCCTCAACTAATCCACCTACAACAGAACCATCGCAACCC 
TCAAGTGAACCAGAAACAAATCCCTCAACTAATCCACCTACAACAGAACCATCGCAACCC

TCACCTAGTGAAGAGAACAAGCCTGATGGTAGAACGAAGACAGAAATTGGCAATAATAAG 
TCACCTAGTGAAGAGAACAAGCCTGATGGTAGAACGAAGACAGAAATTGGCAATAATAAG 
TCACCTAGTGAAGAGAACAAGCCTGATGGTAGAACGAAGACAGAAATTGGCAATAATAAG 
TCACCTAGTGAAGAGAACAAGCCTGATGGTAGAACGAAGACAGAAATTGGCAATAATAAG 
TCACCTAGTGAAGAGAACAAGCCTGATGGTAGCACGAAGACAGAAATTGGCAATAATAAG
TCACCTAGTGAAGAGAACAAGCCTGATGGTAGAACGAAGACAGAAATTGGCAATAATAAG 
TCACCTAGTGAAGAGAACAAGCCTGATGGGAGCACGAAGACAGAAATTGGCAATAATAAG 
TCACCTAGTGAAGAGAACAAGCCTGATGGTAGCACGAAGACAGAAATTGGCAATAATAAG
TCACCTAGTGAAGAGAACAAGCCTGATGGGAGCACGAAGACAGAAATTGGCAATAATAAG 
TCACCTAGTGAAGAGAACAAGCCTGATGGGAGCACGAAGACAGAAATTGGCAATAATAAG 

GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA 
GATATTTCTAGTGGAACAAAAGTATTAATTTCAGAAGATAGTATTAAGAATTTTAGTAAA 

GCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAGTGAT 
GCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAATGAT 
GCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAATGAT
GCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAATGAT 
GCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAATGAT 
GCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAATGAT 
GCAAGTAGTGATCAAGAAGAAGTGGAACGCGATGAATCATCATCTTCAAAAGCAAATGAT 
GCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAATGAT 
GCAAGTAGTGATCAAGAAGAAGTGGAACGCGATGAATCATCATCTTCAAAAGCAAATGAT 
GCAAGTAGTGATCAAGAAGAAGTGGATCGCGATGAATCATCATCTTCAAAAGCAAATGAT 



SEQ2701 GGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA
SEQ2702
SEQ2703
SEQ2704
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 




GGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

GGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

GGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

GAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

GGGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

GAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAASGATACAAGTGATAAGAATACTGACAC 

GAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

GAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA 

GAGAAAAAAGGCCACAGTAAGCCTAAAAAGGAA

AGTGTCGTGACTACGACCTTATCTGAGGAGAAAAGATTAGATGAACTAGACCAGTCTAG 

ACTGGTTCTTCTTCTGAAAATGAATCGAGTTCATCAAGTGAACCAGAAACAAATCCCTC 
ACTAATCCACCTACAACAGAACCATCGCAACCCTCACCTAGTGAAGAGAACAAGCCTGA 
GGTAGCACGAAGACAGAAATTGGCAATAATAAGGATATTTCTAGTGGAACAAAAGTATT 



SEQ2701 

SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 

SEQ2701 
SEQ2702 
SEQ2703 
SEQ2704 
SEQ2705 
SEQ2706 
SEQ2707 
SEQ2709 
SEQ2710 
SEQ2711 



ATTTCAGAAGATAGTATTAAGAATTTTAGTAAAGCAAGTAGTGATCAAGAARAAGTGGA 



CGCGATGAATCATCATCTTCAAAAGCAAATGATGAGAAAAAAGGCCACAGTAAGCCTAA 



AAGGAA



>SEQ ID NO 2750:4_1169NT frame: 1

DTSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKASD
GKKGHSKPKKE

>SEQ ID NO 2751:4_18RS21 frame: 1

DTSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND
GKKGHSKPKKE

>SEQ ID NO 2752:4_2603 frame: 1

DTSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND
GKKGHSKPKKE

>SEQ ID NO 2753:4_090 frame: 1

DQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQPSPSEENKPDGRTKTEIGNNKDISSG
TKVLISEDSIKNFSKASSDQEEVDRDESSSSKANDGKKGHSKPKKE

>SEQ ID NO 2754:4_A909 frame: 1

DTSDKNTDTSVVTTTLSEEKRLDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND
EKKGHSKPKKE

>SEQ ID NO 2755:4_CJB110 frame: 1

DTSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND
GKKGHSKPKKE

>SEQ ID NO 2756:4_COH1 frame: 1

DTSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVERDESSSSKAND
EKKGHSKPKKE

>SEQ ID NO 2757:4_H36B frame: 1

DTSDKNTDTSVVTTTLSEEKRLDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEXVDRDESSSSKAND
EKKGHSKPKKE

>SEQ ID NO 2758:4_JM9130013 frame: 1

DTSDKNTDTSVVTTTLSEEKRLDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND
EKKGHSKPKKE

>SEQ ID NO 2759:4_M732 frame: 1

DTSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVERDESSSSKAND
EKKGHSKPKKE

>SEQ ID NO 2760:4_M781 frame: 1

DTSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND
EKKGHSKPKKE

SEQ2750
SEQ2751 
SEQ2752 
SEQ2753 
SEQ2754 
SEQ2755 
SEQ2756 
SEQ2757 
SEQ2758 
SEQ2759 
SEQ2760 

SEQ2750 
SEQ2751 
SEQ2752 
SEQ2753 
SEQ2754 
SEQ2755 
SEQ2756 
SEQ2757 
SEQ2758 
SEQ2759 
SEQ2760 

SEQ2750 
SEQ2751 
SEQ2752 
SEQ2753 
SEQ2754 
SEQ2755 
SEQ2756 
SEQ2757 
SEQ2758 
SEQ2759
SEQ2760 



TSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
TSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
TSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP

DQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP

TSDKNTDTSVVTTTLSEEKRLDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
TSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
TSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
TSDKNTDTSVVTTTLSEEKRLDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
TSDKNTDTSVVTTTLSEEKRLDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
TSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP
TSDKNTDTSVVTTTLSEEKRSDELDQSSTGSSSENESSSSSEPETNPSTNPPTTEPSQP

SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKASD 
SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND 
SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND 
SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND 
SPSEENKPDGRTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND 
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVERDESSSSKAND 
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEXVDRDESSSSKAND 
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND 
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVERDESSSSKAND 
SPSEENKPDGSTKTEIGNNKDISSGTKVLISEDSIKNFSKASSDQEEVDRDESSSSKAND 

KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
KKGHSKPKKE 
Table 28: Comparative Sequences relating to SAG1552
(conserved hypothetical protein)

SEQ ID NO. 2801: SAG1552 FROM THE 1169NT1 GBS TYPE V STRAIN

TTTCTTCTTAAAGGTCflM 

TCCTTAGCAGGTTATCATCACAACGATTTTCCTATTACTCARAAAACGTATCGTGAGTGGTTCCATTTAATTTCCAAC

mgggggStactgtaagagtc^ 

gat^ttatagggggtatttaaaacgag 
SSttttggtagccgtcat^ 

2a?^attagtttttcaaactcacc^caacagacccttttcgt 
patttccatcctcg^tacaaggattatctattatttgataaagagaatatgagtaaaga 

aaSggISS 

CAAGGTTATGGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAAAGGAGAGTGG 

aaacatcctctg 

SLSaSgcaS 

Sa^ctaaaagI^ 

StI^St^aaat^ 

SaaaaaScacI^^^ 

tttttaaaagactcctattatagtatttaagaaagaa 

SIgStSttaaaagaaaaScaagaa^ 
(Xtcttgt^gttaaaggag^aga^ 

ttttaotatgcxttatatcacca 

cacabaat^cctgttctagtc^ 
St^cgattaatgaaaaa^caaggtcagcgtttact^ 

======= 

AA 

SEQ ID NO. 2804: SAG1552 FROM THE 2603 V/R GBS TYPE V STRAIN
(REVERSE COMPLEMENT) 

TATTAAAAGAAAATACAAGAACTAACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTG 

TTGTTAAAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACTCAAAAAACGTATC 

GTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGAGTCAAGGTACCGATGAATGTTGCATTTTACG 

ATGCCTTATATCACCACAACAAAGCATCAAAGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCA 

ATAATGCTTCTATAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTGGATATTC

TCCATGGGCGTAAGCAAGTATGGAATACTGATTTGGGTAGCCGTCATTATCATTATGATCTTAGTCCTTGGGTACTTG 

GTTATGTCGTAGGGGATGATTGGAATAGTGGTACTGTCGCTTATACTAATCATCAAGAGAAAAAAACGCAATATAAAG 

GACGTTATTTTAAAACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGTAATGGATGAATTGACACATT 

ATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCAACAACAGACCCTTTTCATTATCGAA 

AACCATTTGAGGCACAGGCTCCTAAATACGTACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTTAAAGCAG 

GTATGTTTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATAAAGAGAATATCA 

GTAAAGAAGATAGACAAAAGATTAAAGAACTTTCTTTGTCACAGGGATACGTTAAACTGCTAAATGCTTATCACAAAA 

TCCCTGTTCTAGTCACGGGTTATGGCTATTCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGC

CGATTAATGAAAAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTTTGGAGCGA

CTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCTTTCGCCACAAATAAACATAGTCAATTCC 

TATGGGGGGATGCACAAGTATTTAATCAAGGTTATGGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTG 

ATGGTAAAAGAGGCAAAGGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAGTG 

ATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACGATTATTACCAATAGATATTA 

CACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGTCACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTG 

ATCCAAATGGCAAGTCTGAATTATTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAGCTTAACG 

GTAAAGATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGTATTGAGAAATACAA

AGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTTACCA

GAACAACTGATAGGCACCAAAAAACATTTGATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAA
TTCCGTGGCAGTTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTATGGTGTGA
AGGAGTTAGAAATTGAGAGCATTGCTTTAGGATTAGGTGCTAATAGCAAAGAAAACACACTGATAAAGATGGCAGATT
ATCGTTTGAAAAATTGGGAGAGACCCGATACCAAAACCTTTTTAAAAGACTCCTATTATAGTATTAAGAAAGAATGGT 

CTAAAGAAAGAGAGAGAACATATGGTCCA 

SEQ ID NO. 2805: SAG1552 FROM THE A909 GBS TYPE Ia STRAIN
(REVERSE COMPLEMENT) 

AAGGGCTTATTAAAAGAAAATACAAGAACTAACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAA 
CCTTTTGTTGTTAAAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACTCAAAAA 
ACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGAGTCAAGGTACCGATGAATGTTGCA 
TTTTACGATGCCTTATATCACCACAACAAAGCATCAAAGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCT 
TATCGCAATAATGCTTCTATAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 
GATATTCTCCATGGGCGTAAGCAAGTATGGAATACTGATTTGGGTAGCCGTCATTATCATTATGATCTTAGTCCTTGG 
GTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGTACTGTCGCTTATACTAATCATCAAGAGAAAAAAACGCAA 
TATAAAGGACGTTATTTTAAAACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGTAATGGATGAATTG 
ACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCAACAACAGACCCTTTTCAT 
TATCGAAAACCATTTGAGGCACAGGCTCCTAAATACGTACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTT 
AAAGCAGGTATGTTTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATAAAGAG 
AATATCAGTAAAGAAGATAGACAAAAGATTAAAGAACTTTCTTTGTCACAGGGATACGTTAAACTGCTAAATGCTTAT 
CACAAAATCCCTGTTCTAGTCACGGGTTATGGCTATTCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGT 
CCTCTGCCGATTAATGAAAAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTTT
GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCTTTCGCCACAAATAAACATAGT
CAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTATGGTTTATTAGGCTTTAAAAACGCAAAACATCATTAT
CAAGTTGATGGTAAAAGAGGCAAAGGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCT 
AGCAGTGATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACGATTATTACCAATA 
GATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGTCACATTTTCTAAATCTAGTGACTTTGTATTG 
TCTATTGATCCAAATGGCAAGTCTGAATTATTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAG 
CTTAACGGTAAAGATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGTATTGAGA 
AATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTTACCAACTCATCCTACTGGTCTTCTC
AAAACAGGAACAACTGATAGGCACCAAAAAACATTTGATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAG 
GTCAGAATTCCGTGGCAGTTGTTGAATTTTTCTGATCCATCATCTCAAAGAATTCACGATGATTACTTTAAACATTAT 
GGTGTGAAGGAGTTAGAAAATTGAGAGCCATTGCTTTAGGATTAGGTGCTAATAGCAAAGAAAACACACTGATAAAGA 
TGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACCAAAACCTTTTTAAAAGA 

SEQ ID NO. 2806: SAG1552 FROM THE CJB110 GBS NONTYPEABLE STRAIN

TATTACTTTGATGGTAGTTTGTATTTACCAAAGGGCTTATTAAAAGAAAATACAAGAACTAACTTTGTTGTTAAAGGT
GATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTTAAAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTAT 
CATCACAACGATTTTCCTATTACTCAAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACT 

GTAAGAGTCAAGGTACCGATGAATGTTG^ 

TATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCTATAACAGCTTTTAATGATAATTATAGGGGG 
TATTTAAAACGAGAAGCAAAAGGCGTTGTGGATATTCTCCATGGGCGTAAGCAAGTATGGAATACAGATTTTGGTAGC 
CGTCATTATCATTATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGTACTGTCGCT 
TATACTAATCATCAAGAGAAAAAARCGCAATATAAAGGACGTTATTTTAAAACTTCTGTGGCAGCTAATCCATTTGAG 
GTCATGCTAGCTCAAGTAATGGATGAATTGACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTT 
TCAAACTCACCAACAACAGACCCTTTTCATTATCGAAAACCATTTGAGGCACAGGCTCCTAAATACGTACAACTAAAT 

S^Jattcaagc^^^ 

TACAAGGATTATCTATTATTTGATAAAGAGAATATCAGTAAAGAAGATAGACAAAAGATTAAAGAACTTTCTTTGTCA 

Jagg^S™^ 

ggtattgcccaaaaKgaaattgataaacgtcctc 

tatgaatcttttatatcatccggtagttttggagcgactatcaatgcatggcaagacgattggaatgcaa^ 

AATACATCTTTCGCCACAAATAAACATAATCAATTCCTATGGGGGGATC 

TTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAAAGGAGAGTGGAAACATCCTC 
ACTAGTGCAACAGGAGATGACTTATATGCTA^^ 

ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATTATTTGTCCAAGAGCGCTAT 

aSgcctSaaaagcgaaSatSt 

AATTTTGAGCAGATAAATATGGTATTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGG 
GATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCAGTTGTTGAATTTTTC 

AAAATTCACGATGATTACTTTAAACATTATGGTGTGAAGGAGTTAGAAATTGAGAGCATTGCTTTAGGATTAGGTGCT 

aatSaaag^ 

TTAAAAGACTCCTATTATGTATTAAGAAAGA 

SEQ ID NO. 2807: SAG1552 FROM THE COH1 GBS TYPE III STRAIN 

TTTACCACAGGGCTTATTAAAAGAAARTACAAGAACTAACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCAC 
CAATAAACCTTTTGTTGTTAA^ 

5c^aaaaaStatcgtgaI^ttccatttaatttccaa^ 

agattcttatcgcaataatgcttctataacagcttttaatgataattatagggggtatttaaaacgagaagcaaaagg 

tccttgggtacttggttatgtcgtaggggatgattggaatagtggtactgtcgcttatactaatcatcaagagaaa^ 
aacgcaatataaaggacgttattttaaaacttctgtggcagctaatccatttgaggtcatgctagctcaagtaatgga 

TTT^CATTATCGAAAACCATTTGAGGCACAGGCTCCTAAATACGTACA 

aaatctSgc^ 
tI^gagaatatcagt^ 

tgcttatcacaaaatccctgttctagtc^cgggttatggctattcg^^ 
taaacgtcctctgccgattaatgaaaaagaacaaggtcagc 

tagttttggagcgactatcaatgcatggcaagacgattggaatgcaagggcgtggaatacatctttcgccacaa^ 
acatagtcaattcctatggggggatgcacaagtatttaatcaaggttatggtttattaggctttaaaaato^ 

tcattatcaag^ 
aSSSgcagtSatgaaa^aS 

tgtattgtctattgatccaaatggcaagtctgaattatttgtccaa^ 
tcgacagcttaacggtaaagatttttatgctttcccaccaaa<^^ 
attgagaaatacaaaga^^ 

tcttctcaaaacaggaacaactgatag^accaaaaaacatttgattcacaaccagatatttcgtttgg^ 
tatagaggtcagaattccgtggcagt^ 

acattatggtgtgaaggagttagaaattgagagcattgctttaggattaggtgctaatagcaaagaaaacacactgat 
aaagatggcagattatcgtttgaaaaattgggagagacccgataccaaaacctttttaaaagact 

SEQ ID NO. 2808: SAG1552 FROM THE H36b GBS TYPE Ib STRAIN

^GG^CTTATTAAAAGAAAATACAAGAACTAACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAA 
ACCTTTTGTTGTTAAAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATC^ 

attttacStgccttatatcaccacaac^ 

ScGCAA^AATGCTTCTATAACAGC 

gga?a??ctccatgggcgtaagc^ 

GCTACTTGGTTATGTCGTAGGGGATGATGGACATAGTGGTACTGTCGCTTTATACTAATCATCA 

t^acIcmtatgagaSgctaaatatggttggcaaca^ 
ca??a?SaaaaccatSgaggcacaggctcctaaatacgtacaa^ 
g??aaagcag^atgtSgcagcatataaag^ 
gagaatatcagt^ 

^ScSSc^gttctagtcacgggttatggctact 

CCTCC^CTGCCGATTAATGAAAAAGAACAAGGTCAGCGTTTACTAGAAG 

SgSg?gac?aStgc^ 
aScaattcctatgg^ggatgcacaag™ 

TATrac^TTGATGGTAAAAG 

gSScaSgmSaagctatS 

SSaSSScca^ggcS 

SSaatggIS^ 
SSaSS^gaagacatggaaaa^ 

ctc^aaacaggaacaactgataggcaccaaaaaacatttgattcacaracagatatttcgtttggaaaggactttat 
SSt^Saa?tccgtggcagttgttgaatttttctga^ 

atggcagattatcgtttgaaaaattgggagaga 

SEQ ID NO 2809: SAG1552 FROM THE JM9130013 GBS TYPE VIII STRAIN

ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCC^CCAATAAACCTTTT 

cSSttagcg^ 

SSggggcaaatactgta^^^ 

cIScaaaXgS^ 

AT^AT^ATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTT 
ATAGTGGTACTGTCGCTTATACTAATC 

cSSaatcca?ttcag^ 

aacatttgattagtttttcaaactcaccaacaacagacccttttcattatc^ 
aaScgtacaactaaatgtagaaratatt^ 

agStttactagaagattatgaatc^ 

atc^ggttatggtttattaggctttaaaaacgcaaaacatcattatcaggttga^ 

Jgaaacatcctc^^ 
t^aaaISaaaS^ 
tgaa^ctagtaaggtcacattt^ 

CAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGTATTGAGAAA 

SaaS^aS 
ctt^aggattaggtgctaatagcaaagaa 

CCGATACCAAAACCTTTTTARAAGACTCCTATTATAGTATTAAGAAAG 




SEQ ID NO 2810: SAG1552 FROM THE M732 GBS TYPE III STRAIN

TACAAGAACTAACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTTAAAGGAGT 
AGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACTCAAAAAACGTATCGTGAATGGTTCCA 
TTTAATTTCCAACATGGGGGCAAATACTGTAAGAGTCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCA 
CCACAACAAAGAATCAAAGAGGCCACTGTATTTGTTGCARGGAATACGTATAGATTCTTATCGCAATAATGCTTCTAT 
AACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTGGATATTCTCCATGGGCGTAA 
GCAAGTATGGAATACTGATTTTGGTAGCCGTCATTATCATTATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGG 
GGATGATTGCAATAGTGGTACTGTCGCTTATACTAATCATCAAGAGAAAAAAACGCAATATAAAGGACGTTATTTTAA 
AACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGTAATGGATGAATTGACACATTATGAGACAGCTAA 
ATATGGTTGGC^^CATTTGATTAGTTTTTCAAACTCACCAACAACAGACCCTTTTCATTATCGAAAACCATTTGAGGC 

ACAGGCTCCTAAATACGTACAACTAAATGTAGAAAA 

ATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATAAAGAGAATATCAGTAAAGAAGATAG 
ACAAAAGATTAAAGAACTTTCTTTGTCAC^GGGATACGTTAAACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGT 
CACGGGTTATGGCTATTCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA^ 

agaacSggtcagcgtttactagaagattatgaatctt^^ 

gcaagacgattggaatgcaagggcgtggaatacatctttcgccacaaataaacatagtcaattcctatggggggatgc 
acaagtatttaatcaaggttatggtttattaggctttaaaaacgcaaaacatcattatcaagttgatggtaaaagagg
caaaggagagtggaaacatcctctgatgactagtgcaacaggagatgacttatatgct 
ctaccttgcgattaaaacaaaacctgaaaaactaaaagaaaaacgattattaccaatagata^ 

SctaS^tgaa^g^^ 

TGCTTTCC(^CC^GAAGAACAGTAGTAATTTTGAGCAGATAAATATGGTATTGAGAA^ 

CATGGAAAAAGTAAAAGCAACAGAGAGGTTCTTACCAACTC^TCCTACTGGTCTTCTCAAAACAGGAA 

GCACCAAAAAACATTTGATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGA 

gSgaa^SScScatctcaa^ 

TGAGAG^TTGCTTTAGGATTAGGTGCTAATAGCAAAGAAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAA 
TTGGGAGAGACCCGATACCAAAACCTTTTTAAAAGACTCCTATTATAGTATTAAG 

SEQ ID NO 2811: SAG1552 FROM THE M781 GBS TYPE III STRAIN

TTTGATGGTAGTTTGTATTTACCACAGGGCTTATTAAAAGAAAATAC^GAACTAACTTTGTTGTTAAAGGTGATACT 

gtIcttScaagS 

gtcaaggtaccgatgaatcttgcattt 
??g^ggaatacgtmagat?cttSc^ 
aaac^gaa^caaaaggcgttgtg!^^ 
tIScaSgatI^^ 

ctagct^gtaatggatgaattgacacattatgagacagctaaatatggttggcaacatttgattagtttttcaarc 
tc^ccaacaacagacccttttcattatcgaaaacc^tttgaggcac^ggctcctaaatacgtacaactaaatgtagaa 

GATTATCTATTATTTGATAAAGA^ 

GCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGAAAAAGAACAAGGTCAGCGTTTACTAGAA 

TTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAAAGGAGAGTGGAAACATCCTCTGATGACTAGT 
GCAACAGGAGATGACTTATATGCTAGCAGTGATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTA

a^aaaaacgIS 

TCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATTATTTGTCCAAGAGCGCTATAATGCC 
TTAAAAGCGAACTATCTTCGACAGCTTAACGGTAAAGATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTT 
GAGCAGATAAATATGGTATTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTTA 

TC^TTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCAGTTGTTGAATTTTTCTGATCCATCATCTCAAAAAATT 

cacgSgat^ctttaaaSttat^ 

AAAGAARACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACCAAAACCTTTTTAAAA 
GACTCCTATTATAGTATTAAGAAAGAATGG

AAGGGCTTATTAAAAGAAAATACAAGAACT

~~~ TATTAAAAGAAAATACAAGAACT 

AAGGGCTTATTAAAAGAAAATACAAGAACT

ATTACTTTGATGGTAGTTTGTATTTACCAAAGGGCTTATTAAAAGAAAATACAAGAACT 

TTTACCACAGGGCTTATTAAAAGAAAATACAAGAACT 

AAGGGGCTTATTAAAAGAAAATACAAGAACT 



TACAAGAACT 

TTTGATGGTAGTTTGTATTTACCACAGGGCTTATTAAAAGAAAATACAAGAACT 

--TTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 

ACTTOGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 
ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 
ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 
ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 
ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 
ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 
ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 
ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 
ACTTTGTTGTTAAAGGTGATACTGTACTTCACAAGCCCACCAATAAACCTTTTGTTGTT 

AAGGAGTAGACGTTGAGTCTTCCTTAGCAGGTTATCATCACAACGATTTTCCTATTACT 

^G^GTAGAC^TTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACT 
AAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACT
AAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACT 
AAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACT 
AAGGAGTAGACGTTGACTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACT 
AAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACT 
AAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACT 
AAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCACAACGATTTTCCTATTACT 
AAGGAGTAGACGTTGAGTCTTCCTTAGCGGGTTATCATCAC^ACGATTTTCCTATTACT 

AAAAAACGTATCGTGAGTGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA 

AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA
AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA
AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA
AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA 
AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA 
AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA 
AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA 
AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA 
AAAAAACGTATCGTGAATGGTTCCATTTAATTTCCAACATGGGGGCAAATACTGTAAGA 

TCAAAGTACCGATGAATGTTGCATTTTACGATGCTTTATATCACCACAACAAAGCATCA 

TCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCACCACAACAAAGCATCA
TCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCACCACAACAAAGCATCA 
TCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCACCACAACAAAGCATCA 
TCAAGGTACCGATGAATGTTGC^TTTOACGATGCCTTATATC^CCACAAaVAAGCATCA 
TCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCACCACAACAAAGAATCA 
TCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCACCACAACAAAGCATCA 
TCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCACCACAACAAAGCATCA 
TCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCACCACAACAAAGAATCA 
TCAAGGTACCGATGAATGTTGCATTTTACGATGCCTTATATCACCACAACAAAGAATCA

AGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT 

AGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT 
AGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT 
AGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT 
AGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT 
AGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT 
AGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT

AGAGGCCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT
AGA^CCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCXSCAATAATGCTTCT 
AGA^CCACTGTATTTGTTGCAAGGAATACGTATAGATTCTTATCGCAATAATGCTTCT 

TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG

TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 
TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG
TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 
TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 
TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 
TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 
TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 
TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 
TAACAGCTTTTAATGATAATTATAGGGGGTATTTAAAACGAGAAGCAAAAGGCGTTGTG 

ATATTCTCC^TGGGCGTAAGCAAGTATGGAATACTGATTTTGGTAGCCGTCATTATCAT 

„„ ATGACTA-GTGCAACAGGAGATGACTTATAT— GCTAGCAGTGATGAAAGC 

ATATTCTCCATGGGCGTAAGCAAGTATGGAATACTGATTTGGGTAGCCGTCATTATCAT 

ATATTCTCCATGGGCGTAAGCAAGTATGGAATACTGATTTGGGTAGCCGTCATTATCAT 

ATATTCTCCATGGGCGTAAGCAAGTATGGAATACTGATTTGGGTAGCCGTCATTATCAT 

ATATTCTCCATGGGCGTAAGCAAGTATGGAATACAGATTTTGGTAGCCGTCATTATGAT 

ATATTCTCCATGGGCGTAAGCAAGTATGGAATACTGATTTTGGTAGCCGTCATTATCAT 

ATATTCTCCATGGGCGTAAGCAAGTATGGAATACTGATTTTGGTAGCAGTCATTATCAT 

ATATTCTCCATGGGCGTAAGCAAGTATGGAATACTGATTTTGGTAGC^G 

ATATTCTCCATGGGCGTAAGGAAGTATGGAATACTGATTTTGGTAGCCCTCAOTATCM 

ATATTCTCCATGGGCGTAAGCAAGTATGGAATACTGATTTTGGTAGCCGTCATTATCAT

TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGT--AC 
TAT--CTCTA — CCTTGCG-ATTAAAACAAAACCTGAAAAACTAAAAGAAAAACGATTAT 
TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGT-AC 
TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGT-AC 
TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGT-AC 
TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGT-AC 
TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGT-AC 
TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATGGACATAGTGGT-AC 
TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGGAATAGTGGT-AC 
TATGATCTTAGTCCTTGGGTACTTGGTTATGTCGTAGGGGATGATTGCAATAGTGGT— AC 

taScttagt^^^ 

TGTCGCTT-ATACTAATCATCAAGAGAA-AAAAACGCAATATAAAGGAC-GTTATTTTAA 

SccStLatatta--caccaaaatct 

TGTCGCTT-ATACTAATCATCAAGAGAA-AAAAACGCAATATAAAGGAC-GTTATTTTAA 
TGTC^T-ATACTAATCATCAAGAG^ 

TGTC^T-ATACTAATCATCAAGAGAA-AAAAACGCAATATAA^ 

TGTCGCTT-ATACTAATCATC^GAGAA-AAAAACGCAATATAAAGGAC^ 

TGTCGCTT - ATACTAATC ATCAAGAGAA- AAAAACGCAATATAAAGGAC- ^^^^3^^ 

TCTCGCTTTATACTAATCATCAAGAGGAGAAAAACGCAATATAAAG^ 

TGTCGCTT-ATACTAATCATCAAGAGAA-AAAAACGCAATATAAAGGAC~GTO 

TGTCGCTT-ATACTAATCATCAAGAGAA-AAAAACGCAATATAAAGGAC-GTTATTTTAA

TGTCGCTT-ATACTAATCATCAAGAGAA--AAAAACGCAATATAAAGGAC--GTTATTTTAA 

AACTT CT GCGG C^GCT AATCC ATTTGAGGTCATGCT AGCTCAAGTT ATGGM —GAATTG 
ATTTTCTAAATCTAGTGA-CTTTGTATTGTC-TATTGATCCAAATGGCAAGTCTGAA^- 




AACTTCTGTGGCAGCT AATCCATTTGAGGTCATGCr A^u i o W * — -~ 

AACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGT2\ATGGAT ^^^^ 

AACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGTAATGGAT GAATTG 

AACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGTAATGGAT GAATTG 

AACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGTAATGGAT GAATTG 

AACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGTAATGGAT GAATTG 

AACTTCTGTGGCAGCTAATCCATTTGAGGTCATGCTAGCTCAAGTAATGGAT--GAATTG 

ACAC^TTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCA 

A^TTATG^
SEQ2805
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808
SEQ2809
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811

ACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCA 
ACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCA 
ACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCA 
ACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCA 
ACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCA 
ACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCA 
ACACATTATGAGACAGCTAAATATGGTTGGCAACATTTGATTAGTTTTTCAAACTCACCA 

CAACAGAC CCTTTTCGTTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

TAAAGATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATCAATA 

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

CAACAGAC CCTTTTCATTATCGAAAACCATTTG-AGGCACAGGCTCCTAAATA 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCGAATGTTAAAGCA GGTATTT 

GGTATTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGT 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTTAAAGCA GGTATGT 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTTAAAGCA GGTATGT 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTTAAAGCA GGTATGT 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTTAAAGCA GGTATGT 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTTAAAGCA GGTATGT

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCGAATGTTAAAGCA GGTATGT 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCGAATGTTAAAGCA GGTATGT 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTTAAAGCA GGTATGT 

G-TACAACTAAATGTAGAAAATATTCAAGCTAATTCAAATGTTAAAGCA GGTATGT 

TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 

TCTTACCAACTCATCCTACTGGTCTT CTCAAAACAGGAACAATTGAT-AGGCACCA 

TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 
TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 
TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 
TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 
TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 
TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 
TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA
TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 
TTGCAGCATATAAAGCTATTGATTTCCATCCTCGATACAAGGATTATCTATTATTTGATA 

AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA 
AAAAACATTTGATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAAT 
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA 
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA 
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA
AAGAGAATATCAGTAAAGAAGATAGACAAAAGATT-AAAGAACTTTCTTTGTCACAGGGA

TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA

TCCGTGGCAGTTGTTGAATTTTTCTGATCCA TCATCTCAAAAAATTCACGATGATTA 

TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA 
TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA 

TACX3TTA-AACTGCTAAATGCITATGACAAAATCCCT 

TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA 
TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA 
TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA 
TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA 
TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA 
TACGTTA-AACTGCTAAATGCTTATCACAAAATCCCTGTTCTAGTCACGGGTTATGGCTA 



SEQ2801
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801
SEQ2802
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 






TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATG^ 
TTTAAACATTATGGTGTGAAGGAGTTAGAAATTGA-GAGCATTGCTTTAGGATTAGGTG 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 
TCGACAGCGAGAGGTATTGCCCAAAAAGAAATTGATAAACGTCCTCTGCCGATTAATGA 

AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 

TAATAGCAAAGAAAACACACTGATAAAGATGGCAGAT TATCGTTTGAAAAATT 

AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 
AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 
AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT
AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 
AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 
AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 
AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 
AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 
AAAGAACAAGGTCAGCGTTTACTAGAAGATTATGAATCTTTTATATCATCCGGTAGTTT 

GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCCTT 

GGAGAGAC — CCGATAC CAAAACCTTTTTAA AAGACTCCTATTATAGTATT 

GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCTTT 
GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCTTT 
GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCTTT 
GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCTTT 
GGAGCGACTATCAATGCATGGCT^AGACGATTGGAATGCAAGGGCGTGGAATACATCTTT 
GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGTGTGGAATACATCCTT 
GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGTGTGGAATACATCCTT 
GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCTTT 
GGAGCGACTATCAATGCATGGCAAGACGATTGGAATGCAAGGGCGTGGAATACATCTTT 

GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 

A—— AGAAAGAA *" — — — 

GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 
GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 
GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 
GCCACAAATAAACATAATCT^ATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 
GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 
GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 
GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 
GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 
GCCACAAATAAACATAGTCAATTCCTATGGGGGGATGCACAAGTATTTAATCAAGGTTA 

GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAA 

GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAA
GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAA 
GGtTTATTAGGCTTTAAAAACGOW^ACATCATTATCAAGTTGATGGTAAAAGAGGCAA 
GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAA 
GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAA 
GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAGGTTGATGGTAAAAGAGGCAA 
GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAGGTTGATGGTAAAAGAGGCAA 
GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAA 
GGTTTATTAGGCTTTAAAAACGCAAAACATCATTATCAAGTTGATGGTAAAAGAGGCAA 



GGAGAGTGGAAACATCCTCTG- 



GGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG 
GGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG 
GGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG 
GGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG 
GGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG 
GAAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG




SEQ2809 
SEQ2810 
SEQ2811

SEQ2801 
SEQ2802
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808
SEQ2809
SEQ2810
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804



GAAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG 
GGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG 
GGAGAGTGGAAACATCCTCTGATGACTAGTGCAACAGGAGATGACTTATATGCTAGCAG 



GATGAAAGCTATCTCTACCTTGCGATTAAT^ACAAAACCTGAAAAACTAAAAGAAAAACG 
GATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACG 
GATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACG 
GATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACG 
GATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACG 
GATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACG 
GATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACG
GATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACG 
GATGAAAGCTATCTCTACCTTGCGATTAAAACAAAACCTGAAAAACTAAAAGAAAAACG 



TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT 
TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT 
TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT 
TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT
TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT
TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT 
TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT 
TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT 
TTATTACCAATAGATATTACACCAAAATCTGGTAGTAGAAAAATGAATGGTAGTAAGGT 



ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATT 
ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATT 
ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATT 
ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATT 
ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCC^AATGGCAAGTCTGAATT 
ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATT 
ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATT 
ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATT 
ACATTTTCTAAATCTAGTGACTTTGTATTGTCTATTGATCCAAATGGCAAGTCTGAATT 



TTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAGCTTAACGGTAA 
TTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAGCTTAACGGTAA
TTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAGCTTAACGGTAA
TTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAGCTTAACGGTAA
TTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAGCTTAACGGTAA 
TTTGTCCAAGAGCGCTATAACGCCTTAAAAGCGAACTATCTTCGACAGCTTAATGGTAA 
TTTGTCCAAGAGCGCTATAACGCCTTAAAAGCGAACTATCTTCGACAGCTTAATGGTAA 
TTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAGCTTAACGGTAA 
TTTGTCCAAGAGCGCTATAATGCCTTAAAAGCGAACTATCTTCGACAGCTTAACGGTAA 



GATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGT 

GATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGT 

GATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGT 

GATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGT 

GATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGT 

toTTOTATGCTTTCC C^CCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAAT^^ 

GATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGT 

GATTTOT ATGCTTTCCCACCAAAGAAGA^ 

GATTTTTATGCTTTCCCACCAAAGAAGAACAGTAGTAATTTTGAGCAGATAAATATGGT 



TTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTT 
TTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTT







SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802
SEQ2803
SEQ2804
SEQ2805
SEQ2806
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801
SEQ2802
SEQ2803
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 

SEQ2801 
SEQ2802 
SEQ2803 
SEQ2804 
SEQ2805 
SEQ2806 
SEQ2807 
SEQ2808 
SEQ2809 
SEQ2810 
SEQ2811 



TTGAGAAATACAAAGATTGTTGAAGAC^TGGAAAAAGTAAAAGCAACAGAGAGGTTCTT 
TTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTT 
TTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTT 
TTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTT 
TTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTT 
TTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTT 
TTGAGAAATACAAAGATTGTTGAAGACATGGAAAAAGTAAAAGCAACAGAGAGGTTCTT 



CCAACTCATCCTACTGGTCTTCTCAAAACAGGAACAACTGATAGGCACCAAAAAACATT 
CCRACTCATCCTACTGGTCTTCTCAAAACAGGAACAACTGATAGGCACCAAAAAACATT 
CCAACTCATCCTACTGGTCTTCTCAAAACAGGAACAACTGATAGGCACCAAAAAACATT 
CCAACTCATCCTACTGGTCTTCTCAAAACAGGAACAACTGATAGGCACCAAAAAACATT 
CCAACTCATCCTACTGGTCTTCTCAAAACAGGAACAACTGATAGGCACCAAAAAACATT 
CCAACTCATCCTACTGGTCTTCTCAAAAC^GGAAC^CTGATAGGCACCAAAAAACATT 
CCAACTCATCCTACTGGTCTTCTCAAAACAGGAACAACTGATAGGCACCAAAAAACATT 
CCAACTCATCCTACTGGTCTTCTCAAAACAGGAACAACTGATAGGCACCAAAAAACATT 
CCAACTCATCCTACTGGTCTTCTCAAAACAGGAACAACTGATAGGCACCAAAAAACATT 



GATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 
GATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 
GATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 
GATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 
GATTCACAACCAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 
GATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 
GATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 
GATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 
GATTCACAAACAGATATTTCGTTTGGAAAGGACTTTATAGAGGTCAGAATTCCGTGGCA 



TTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTA 
TTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTA 
TTGTTGAATTTTTCTGATCCATCATCTCAAAGAATTCACGATGATTACTTTAAACATTA 
TTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTA 
TTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTA 
TTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTA
TTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTA 
TTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTA 
TTGTTGAATTTTTCTGATCCATCATCTCAAAAAATTCACGATGATTACTTTAAACATTA 



GGTGTGAAGGAGTTAGAAATTGAGAG — CATTGCTTTAGGATTAGGTGCTAATAGCAAA 
GGTGTGAAGGAGTTAGAAATTGAGAG — CATTGCTTTAGGATTAGGTGCTAATAGCAAA 
GGTGTGAAGGAGTTAGAAAATTGAGAGCCATTGCTTTAGGATTAGGTGCTAATAGCAAA 
GGTGTGAAGGAGTTAGAAATTGAGAG — CATTGCTTTAGGATTAGGTGCTAATAGCAAA 
GGTGTGAAGGAGTTAGAAATTGAGAG — CATTGCTTTAGGATTAGGTGCTAATAGCAAA 
GGTGTGAAGGAGTTAGAAATTGAGAG— CATTGCTTTAGGATTAGGTGCTAATAGCAAA 
GGTGTGAAGGAGTTAGAAATTGAGAG — CATTGCTTTAGGATTAGGTGCTAATAGCAAA 
GGTGTGAAGGAGTTAGAAATTGAGAG — CATTGCTTTAGGATTAGGTGCTAATAGCAAA 
GGTGTGAAGGAGTTAGAAATTGAGAG— CATTGCTTTAGGATTAGGTGCTAATAGCAAA 



AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC 
AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC 
AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC 
AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC 
AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC 
AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC 
AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC 
AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC 
AAAACACACTGATAAAGATGGCAGATTATCGTTTGAAAAATTGGGAGAGACCCGATACC






SEQ2801 
SEQ2802 
SEQ2803
SEQ2804 



AAACCTTTTTAAAAGACTCCTATTATGTATTAAGAAAGAA 

AAACCTTTTTAAAAGACTCCTATTATAGTATTAAGAAAGAATGGTCTAAAGAAAGAGAG 

SEQ2805 AAACCTTTTTAAAAGA 

SEQ2806 AAACCTTTTTAAAAGACTCCTATTATGTATTAAGAAAGA 

SEQ2807 AAACCTTTTTAAAAGACT 

SEQ2808 AAACCTTTTTAAAAGACTCCTATTATAGT 

SEQ2809 AAACCTTTTTAAAAGACTCCTATTATAGTATTAAGAAAG 

SEQ2810 AAACCTTTTTAAAAGACTCCTATTATAGTATTAAG

SEQ2811 AAACCTTTTTAAAAGACTCCTATTATAGTATTAAGAAAGAATGG



SEQ2801 

SEQ2802 

SEQ2803 

SEQ2804 GAACATATGGTCCA 

SEQ2805 

SEQ2806 

SEQ2807 

SEQ2808 

SEQ2809 

SEQ2810

SEQ2811 




KVPMNVAFYDALYHHNKASKRPLYLLQGIRIDSYRNNASITAFNDNYRGY 
IIJ1GRKQVWNTDFGSRHYHYDLS PWVI^YWGDDWNSGTVAYTNHQEKKTQYI^^YEKTS 
AAANPFEVMLAQVMDELTHYETAKYGWQHLISFSNSPTTDPFRYRKPFEAQAPKYVQLNV
ENIQANSNVKAGIFAAYKAIDFHPRYKDYLLTOKENISKEDRQKIKELSLSQ^KLLNA 
YHKIPVLVTGYGYSTARGIAQKEIDKRPLPINEKEQGQRLLEDYESFISSGSFGATINAW
QDDWNARAWNTSFATNKHSQFLWGDAQVFNQGYGLLGF 

MTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDITPKSGSRKMNGSKVTFSKSSD
FVLSIDPNGKSELFVQERYNALKANYLRQLNGKDFYAFPPKKNSSNFEQINMVLRNTKIV 

EDMEKVKATERFLPTHPTGLLKTGT I DRHQKTFDSQTDIS FGKDFIEVRI p ^QkI*N E« e v 
SSQKIHDDYFKHYGVKELEIESIALGLGANSKENTLIKMADYRIiKNWERPDTKTFL^ 

YSI.ER 

>SEQ ID NO 2851:62 18RS21 frame: 1 

KGLLKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVESSLAGYHHNDFPITQKTYREWFHL

ISNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRIDSYRNNASITAFNDNYRG
YLKREAKGVVDILHGRKQVWNTDLGSRHYHYDLSPWVLGYVVGDDWNSGTVAYTNHQEKK
TQYKGRYFKTSVAANPFEVMLAQVMDELTHYETAKYGWQHLISFSNSPTTDPFHYRKPFE
AQAPKYVQLNVENIQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKENISKEDRQKIKELS
LSQGYVKLLNAYHKIPVLVTGYGYSTARGIAQKEIDKRPLPINEKEQGQRLLEDYESFIS 
SGSFGATINAWQDDWNARAWNTSFATNKHSQFLWGDAQVFNQGYGLLGFKNAKHHYQVDG 
KRGKGEWKHPLMTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDITPKSGSRKMN
GSKVTFSKSSDFVLSIDPNGKSELFVQERYNALKANYLRQLNGKDFYAFPPKKNSSNFEQ
INMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDISFGKDFIEVR 
IPWQLLNFSDPSSQKIHDDYFKHYGVKELEIESIALGLGANSKENTLIKMADYRLKNWER 

PDTKTFLKDSYYVLRK 



>SEQ ID NO 2852:62 2603 frame: 3 

LKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVESSLAGYHRNDFPITQKTYREWFHLISN
MGANTVRVKVPMNVAPYDALYHHNKASKRPLYLLQGIRIDSYRNNASITAFNDNYBGY^ 
REAKGWDIIiRGRKQVWNTDLGSRHYHYDLSPWVLGYVVGDDWNSGTVAYTNHQEKKTQY 
KGRYFKTSVAANPFEVMLAQVMDELTHYETAKYGWQHLISFSNSPTTDPFHYRKPFEAQA 
PKYVOIiNVEN IQAN SNVKAGMFAAYKAI DFRPRYKDYLLFDKENI SKEDRQKIKELSLSQ 
GYVKLLNAYHKIPVLVTGYGYSTARGIAQKEIDKRPLPINEKEQGQRLLEDYESFISSGS
— *ni-ritn»An rkt.ivT n o 7\TiTvtrr g cnmjK hr^rt .Wf? D AOVFNOG YGLLG FKNAKHHYQVDGKRG 



KGEWKHPLMTSATGDDLYASSDESYLYLAIKTKPEKliKKtuu,L,irxLrx x irxvoi^^^^v 

VTFSKSSDFVLSIDPNGKSELFVQERYNAUCANYLRQIiNGKDFYAFPPKKNSSNFEQINM 

VLRNTKIVEDMEKVKATERFLPTHPTGI^KTGTTDRHQKTFDSQTDISFGKDFIEVRIPW 

QLLNFSDPSSQKIHDDYFKHYGVKELEIESIALGIX3ANSKENTLIKMADYRLKIW 

KTFLKDSYYSIKKEWSKERERTYGP

>SEQ ID NO 2853:62 A909 frame: 1
KGLLKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVESSi^ 

ISNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRIDSYRNNASITAFNDNYRG

YLKREAKGVVDILHGRKQVWNTDLGSRHYHYDLSPWVLGYVVGDDWNSGTVAYTNHQEKK
TQYKGRYFKTSVAANPFBVMIAQVMDELTHYETAKYGWQHLISFSNSPTTDPFHYRCTFE 

AQAPKYVQLNVENIQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKENISKEDRQKI
LSQGYVKLLNAYHKIPVLVTGYGYSTARGIAQKEIDKRPLPINEKEQGQRLLEDYESFIS
SGSFGATINAWQDDWNARAWNTSFATNKHSQFLWGDAQVFNQGYGLLGFKNAKHHYQVDG 
KRGKGEWKHPLMTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDITPKSGSRKMN
GSKVTFSKSSDFVLSIDPNGKSELFVQERYNALKANYLRQLNGKDFYAFPPKKNSSNFEQ
INMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDISFGKDFIEVR
IPWQLLNFSDPSSQRIHDDYFKHYGVKELEN . EPLL . D . VLIAKKTH . - RWQIIV . KIGR 
DPIPKPF.K

>SEQ ID NO 2854:62 A909 frame: 1
KGLLKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVESSIiAGYHHNDFPITQKTYMWraL 

ISNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRIDSYR

YLKREAKGVVDILHGRKQVWNTDLGSRHYHYDLSPWVLGYVVGDDWNSGTVAYTNHQEKK
TQYKGRYFKTSVAANPFEVMLAQVMDELTHYETAKYGWQHLISFSNSPTTDPFHYRKPFE 
AQAPKYVQLNVENIQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKENISKEDRQKIKELS 

LSQGYVKLLNAYHKIPVLVTGYGYSTARGIAQKEIDKRPLPINEKEQGQRLLEDYESFIS
SGSFGATINAWQDDWNARAWNTSFATNKHSQFLWGDAQVFNQGYGLLGFKNAKHHYQVDG
KRGKGEWKHPIKTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDITPKSGSRK^
GSKVTFSKSSDFVLSIDPNGKSELFVQERYNALKANYLRQLNGKDFYAFPPKKNSSNFEQ
INMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDISFGKDFIEVR
IPWQLLNFSDPS SQRIHDDYFKHYGVKELEN . EPLL . D . VLIAKKTH . . RWQIIV . KIGR 
DPIPKPF.K

>SEQ ID NO 2855:62jMBllO frame: 1
PKGLLKENTRTN FWKGDTVLHKPTNKPFWKGV DVES SLAGYHHNDFPIT 
LI SNMGANTVRVKVPMNW PvnnrvHHMtfaWRPT.VUiJ^ TTSVS YB H ti<\S 
3YLKREAKGVVDILHGRK< 
KTQYKGRYFKTSVAANPF] 
EAQAPKYVQLNVENIQAN 
SLSQGYVKLLN AYHKI PV 
SSGSFGATINAWQDDWNA 
GKRGKGEWKHPLMTSATG 
[NGSKVTFSKS S DFVLS I D 
IQINMVLRNTKIVEDMEKV 
or uixvz i/RI PWQLLNFS DPSSQKIH 
ADYRLKNWERPDTKTFLKDSYYVLRK 



YYFDGS LYLPKGLLKENTRTN FWKG DT VLHKPTNKP FWKGV UVbS J> urn* x nnwtJ" jl x 
QKTYREWFHLISI^GANTTOVKVPI^AI^DALYHHNKASKRPL 

ITAFNDNYRGYLKREAKGWDILHGRKQVWNT D FGSRHYHYDLSPWVLGYWGDDWNSGT 
VAYTNHQEKKTQYKGRYFKTSVAANPFEVMIJ^^^ 

DP FH YRKPFEAQAPKYVQLNVENIQANStWKAGMFAAYKAI DFHPRYKDYLLFDKEN I SK 
EDRQKIKEI^LSQGYVKLLNAYHKIPVLVTGYGYSTARGIAQKEIDKRPLP^^QG^ 
LLEDYESFISSGSFGATINAWQDDWNARAWNTSFATNKHNQFLWGDAQVEiaQGYGLLGFK 
NAKHHYQVDGKRGKGEV^PLMTSATGDDLYASSDESYLYIAIKTKPEKLKEKRLLPIDI 
TPKSGSRKMNGSKVTFSKSS DFVLS I DPNGKSELFVQERYNALKANYLRQLNGKDFYAFP 
PKKNSSNFEQINMVLRNTKIVEDMEKVKATERFLOT^ 

SFGKDFIETOIPWQLLNFSDPSSQKIHDDYFKRYGVKELEIESIALGLGANSKENTLIKM 

— -r wumtn T% Tl nm VIP 1PT VT\ C V"V\7T .O V I 



ERPDTKTFLKD 

>SEQ ID NO 2857:62_H36B frame: 2

RGLLKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVE S SLAGYHHNDFPITQKT YREWFHL 
ISNMGANTVRVKVPMNVAETDALYHHNKASKRPLYLLQGIRIDSYRNNASITAFNDNYR^ 
YLKREAKGWDILHGRKQVWNTDFGSSHYHYDLS PWVLGYWGDDGHSGTVALY 

>SEQ ID NO 2858:62_JM9130013 frame: 3
FWKG DTVLHK PTNKP FWKGV DVE S S LAGYHHN D FP ITQKT YREWFHLI SNMG ANTVRV 
KVPMOTAFYDALYHHNKASKRPLYLI^GIRIDSYRNNASITAFNDNYRGYLKRi^GWD 
ILHGRKQVWNT DFGS SHYHYDLS PWVLGYWGDDWNSGTVAYTNHQEKKTQYKGRYFKTS 
VAANP FEVMLAQVMDELTHYETAKYGWQHLIS FSN S PTTDPFHYRKPFEAQAPKYVQLN V 
ENIQANSNVKAGMFAAYKAI DFHPRYKDYIjLFDKEN ISKEDRQKIKELSLSQGYVKLLNA 
vtTtr I PVLVTGYGYS TARGIAQKE I DKRPLP INEKEQGQRLLE DYE SFISSGSFGAT IN AW 



YHKIE 


QDDWNAR\^TSFATNKHSQFLWGDAQVFNQGYGLLGFKNAKHHYQVDGKRGKEEWKHPL 
MTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDITPKSGSRKMNGSKVTFSKSSD 
FVLSIDPNGKSELFVQERYNMiKANYI^QLNGKDFYAFPPKKNSSNraQINMVLRNTKIV 
EDMEKVKATERFLPTHPTGLLKTGTTDRHQKT FDSQTDI S FGKDFI EVRI PWQLLNFS DP 
SSQKIHDDYFKHYGVKELEIESIALGLGANSKENTLIKMADYRLKNWERPDTKTFLKDSY 

YSIKK 

>SEQ ID NO 2859:62_M732 frame: 2

TRTNEVVKGDTVLHKPTNKPFVVKGVDVESSLAGYHHNDFPITQKTYREWFHLISNMGAN 
TVRVKVPMNVAFYDALYHHNKESKRPLYLLQG IRI DSYRNNAS ITAFN DNYRG YLKRE AK 
GWDILRGRKQVWNTDFGSRHYHYDLSPWVLGYWGDDCNSGTVAYTNHQEKKTQYKGRY 
FKTS VAAN PFEVMLAQVMDELTHYETAKYGWQHLI SFSNS PTTDPFHYRKPFE AQAPKYV 
QLNVENIQANSNVKAGMFAAYKAI DFHPRYKDYLLFDKENI SKEDRQKIKELS LS QGYVK 
LLNAYHKI PVLVTGYGYSTARGIAQKEI DKRPLPINEKEQGQRLLEDYESFI S SGS FGAT 
INAWQDDWNARAWNTSFATNKHSQFLWGDAQVFNQGYGLLGFKNAKHHYQVDGKRGKGEW 
KHPLMTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDITPKSGSRKMNGSKVTFS 
KS S DFVLS I DPNGKSELFVQERYNALKANYLRQLNGKDFYAFPPKKNS SNFEQINMVLRN 
TKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDISFGKDFIEVRIPWQLLN 
FSDPSSQKIHDDYFKHYGVKELEIESIALGLGANSKENTLIKMADYRLKNWERPDTKTFL 

KDSYYSIK 

>SEQ ID NO 2860:62_M781 frame: X

FDGSLYLPQGLLKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVESSLAGYHHNDFPITQK 
TYREWFHLISNMGANTVRVKVPMNVAFYDALYHHNKESKRPLYLLQGIRIDSYRNNASIT 
AFNDNYRGYLKREAKGWDILKGRKQVWNTDFGSRHYHYDLSPWVLGYVVGDDWNSGTVA 
YTNHQEKKTQYKGRYFKTSVAANPFEVMLAQVMDELTHYETAKYGWQHLISFSNSPTTDP 
FHYRKPFEAQAPKYVQLNVENIQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKENISKED 
RQKIKELSLSQGYVKLLNAYHKIPVLVTGYGYSTARGIAQKEIDKRPLPINEKEQGQRLL 
EDYES FI S SGSFGATINAWQDDWNARAWNTS FATNKHSQFLWGDAQVFNQGYGLLGFKN A 
KHHYQVDGKRGKGEWKHPmTSATGDDLYASSDESYLYLAIKTKPEKIiKEKRLLPIDITP 
KSGSKKMNGSKSTTFSKSSDFVLSIDPNGKSELFVQERYNALKANYLRQLNGKDFYAFPPK 
KNS SNFEQINMVLRNTKI VE DMEKVKATERFLPTHPTGLLKTGTTDRHQKT FDSQTDI S F 
GKDFIEVRI PWQLLNFSDPS SQKIHDDY FKHYGVKELE IESIALGLGANSKENTLIKMAD 
YRLKNWERPDTKTFLKDSYYSIKKEW 



SEQ2850 FWKGDTVLHKPTNKPFWKGVDVESSLAGYHHNDFPIT 

SEQ2851 KGLLKENTRTNEWKGDTVLHKPTNKPFVVKGVDVESSLAGYHHNDFPIT 

SEQ2852 LKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVESSLAGYHHNDFPIT 

SEQ2853 KGLLKENTRTN FWKG DT VLHKPTNKPFWKGV DVE S S LAG YHHN DFP I T 

SEQ2854 KGLLKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVESSLAGYHHNDFPIT 

SEQ2855 YFDGSLYLPKGLLKENTRTNFWKGDTVLHKPTNKPFWKGVDVESSLAGYHHNDFPIT 

SEQ2856 LPQGLLKENTRTNFWKGDTVLHKPTNKPFWKGVDVES S LAGYHHN DFPIT 

SEQ2857 RGLLKENTRTNFVVKGDTVLHKPTNKPFVVKGVDVESS LAGYHHN DFP IT 

SEQ2858 FWKGDTVLHKPTNKPFWKGVDVESSLAGYHHNDFPIT 

SEQ2859 TRTNFWKGDTVLHKPTNKPFWKGVDVESSLAGYHHNDFPIT 

SEQ2860 - FDGSLYLPQGLLKENTRTN FWKGDT VLHKPTNKPFWKGVDVES SIAGYHHNDFPIT 

SEQ2850 QKTYREWFHLISNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRIDSYRNNAS 

SEQ2851 QKTYREWFHLISNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRI DSYRNNAS 

SEQ2852 QKTYREWFHLI SNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGI RIDS YRNNAS 

SEQ2853 QKTYREWFHLI SITOGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRI DS YRNNAS 

SEQ2854 QKTYREWFHLISNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRIDSYRNNAS 

SEQ2855 QKT YREWFHLI SNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRI DS YRNNAS 

SEQ2856 QKTYREWFHLISNMGANTVRVKVPMNVAFYDALYHHNKE SKRPLYLLQGIRI DS YRNNAS 

SEQ2857 QKTYREWFHLI SNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRI DS YRNNAS 

SEQ2858 QKTYREWFHLISNMGANTVRVKVPMNVAFYDALYHHNKASKRPLYLLQGIRIDSYRNNAS 

SEQ2859 QKTYREWFHLI SNMGANTVRVKVPMNVAFYDALYHHNKE SKRPLYLLQGIRI DSYRNNAS 

SEQ2860 QKTYREWFHLISNMGANTVRVKVPMNVAFYDALYHHNKESKRPLYLLQGIRIDSYRNNAS

SEQ2850 ITAFNDNYRGYLKREAKGWDILHGRKQVWNTDFGSRHYHYDLSPWVLGYWGDDWNSGT 

SEQ2851 I TAFNDNYRG YLKREAKGWDI LHGRKQVWN T DLGS RH YHYDL S PW VLG YVVG D DWN S GT 

SEQ2852 ITAFNDNYRGYLKREAKGWDILHGRKQVWNTDLGSRHYHYDLSPWVLG YWGDDWNSGT 

SEQ2853 IT AFNDNYRGYLKREAKGWDI LHGRKQVWNTDLGSRHYHYDLS PWVLGYWGDDWN SGT 

SEQ2854 ITAFNDNYRGYLKREAKGWDILHGRKQVWNTDLGSRHYHYDLSPWVLGYWGDDWNSGT 

SEQ2855 ITAFNDNYRGYLKREAKGWDI LHGRKQVWNTDFGSRHYHYDLS PWVLG YWGDDWNSGT 

SEQ2856 ITAFNDNYRGYLKREAKGWDILHGRKQVWNTDFGSRHYHYDLSPWVLGYWGDDWNSGT

SEQ2857 I T AFN DN YRG YLKREAKG WD I LHGRKQVWNT DFG S SHYHYDLSPWVLGYWGDDGHSGT 

SEQ2858 ITAFNDNYRGYLKREAKGWD I LHGRKQVWNTDFGS SHYHYDLS PWVLGYWGDDWNSGT 
SEQ2859
SEQ2860 
SEQ2850 
SEQ2851 
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855 
SEQ2856 
SEQ2857 
SEQ2858 
SEQ2859 
SEQ2860

SEQ2850 
SEQ2851 
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855
SEQ2856 
SEQ2857 
SEQ2858 
SEQ2859 
SEQ2860 

SEQ2850 
SEQ2851
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855 
SEQ2856 
SEQ2857 
SEQ2858 
SEQ2859 
SEQ2860 

SEQ2850 
SEQ2851 
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855 
SEQ2856 
SEQ2857 
SEQ2858 
SEQ2859 
SEQ2860 

SEQ2850 
SEQ2851 
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855 
SEQ2856 
SEQ2857 
SEQ2858 
SEQ2859 
SEQ2860 

SEQ2850 
SEQ2851 
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855 




ITAFN DNYRGYLKREAKGVVDILHGRKQVWNTDFGSRHYHYDLS PWVLGYWGDDCNSGT 
IT^DNYRGYLKREAKGWDILHGWCQVWNTDEXSSRHYHYDLSPOTI^YWGDDWNSGT 
VAYTNHOEKKTQYKGRYFKTSAAANPFEVMLAQVMDELTHYETAKYGWQHLISFSNSPTT 
VAYTNHOEKKTQYKGRY FKTSVAAN PFEVMLAQVMDELTHYETAKYGWQHLI S FSNS PTT 
VAYTNHOEKKTQYKGRY FKT S VAANPFEVMLAQVMDELTHYET AKYGWQHLI S FSN S PTT 
VAYTNHQEKKTQYKGRYFKTSVAANPFEVMLAQVMDELTHYETAKYGWQHLISFSNSPTT 

VAYTN HQEKKTQ YKGRY FKT S VAAN P FEVMLAQVMDELT HYETAKY GWQH L I S FSN S PTT 
VAYTNHQEKKTQYKGRYFKTSVAANPFEVMLAQVMDELTHYETAKYGWQHLI S FSN S PTT 
VAYTNHQEKKTOYKGRYFKTSVAANPFEVMIAQVMDELTHYETAKYGWQHLISFSNSPTT 

VAYTOHQE^ 

VAYTNHQEKKTQYKGRYFKTSVAANPFEWIAQVMDELTHYBTAKYGWQHLISFSNSPTT 
VAYTNHQEKKTQYKGRYFKTSVAANPFEVMLAQVMDELTHYETAKYGWQHLISFSNSPTT 

PFRYRKPFEAQAPKYVQLNVEN IQAN SNVKAG IFAAYKAIDFHPRYKDYLLFDKENISK 
PFHYRKP FEAQAPKYVQLNVEN IQANSNVKAGMFAAYKAI D FHPRYKDYLLFDKEN I SK 
PFHYRKPFEAQAPKYVQLNVENIQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKENISK 
PFHYRKPFEAQAPKYVQLNVENIQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKENISK 
PE11YRKPFEAQAPKWQLNVENIQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKENISK 
PFHYRKPFEAQAPKYVQLNVEN IQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKEN I SK ( 
PFHYRKPFEAQAPKYVQLNVENIQANSN VKAGMFAAYKAI DFHPRYKDYLLFDKEN ISK 

PFHYRKPraAQAPKYVQLNVEN IQANSNVKAGMFAAYKAIDFHPRYKDYLLFDKEN I SK 
PFHYRKPFEAQAPKYVQLNVENIQANSNVKAGMFAAYKAI DFHPRYKDYLLFDKEN ISK 
P FHYRKPFEAQAPKYVQLNVEN IQAN SNVKAGMFAAYKAI DFHPRYKDYLLFDKEN I SK 

DRQKIKELSLSQGYVKLLNAYHKI PVLVTGYG YSTARGI AQKEI DKRPLP INEKEQGQR 
DRQKIKELSLSQGYVKLLNAYHKIPVLVTGYGYSTARGIAQKEIDKRPLPINEKEQGQR 
DRQKIKELSLSQGYVKLLNAYHKI PVLVTGYGYSTARGIAQKEI DKRPLPINEKEQGQR 
DRQKIKELSLSQGYVKLLNAYHKI PVLVTGYGYSTARGIAQKEIDKRPLPINEi^QGQR 
DRQKIKELSLSQGYVKLLNAYHKIPVLVTGYGYSTARGIAQKEIDKRPLPINEKEQGQR 
DRQKIKELSLSQGYVKLLNAYHKI PVLVTGYGYSTARGIAQKEI DKRPLPINEKEQGQR 
DRQKIKELS LSQGYVKLLNAYHKI PVLVTGYGYSTARGIAQKEI DKRPLPINEKEQGQR 

DRQKIl^LSLSQG^ 

DRQKIKELS LSQGYVKLLNAYHKI PVLVTGYGYSTARGIAQKEI DKRPLPINEKEQGQR 
DRQKI KELS LS QGYVKLLN AYHKI PVLVTGYGY STARGI AQKEI DKRPLPINEKEQGQR 

LE DYES FI S SGSFGAT INAWQDDWNARAWNT S FATNKHSQFLWGDAQVFNQGYGLLGFK 
LEDYESFISSGSFGATINAWQDDWNARAWNTSFATNKHSQFLWGDAQVFNQGYGLLGFK 
LEDYESFISSGSFGATINAWQDDWNARAWNTSFATNKHSQFLWGDAQVFNQGYGLLGre 
LE DYE S FI S S G S FGAT INAWQDDWNARAWNT S FATNKHS QFLWG D AQV FNQG YGLLGFK 
LEDYES FIS SGS FGAT INAWQDDWNARAWNTSFATNKHSQFLWGDAQVFNQGYGLLGFK 
LEDYESFI SSGSFGATINAWQDDWNARAWNTS EATNKHNQFLWGDAQVFNQGYGLLGFK 
LEDYESFI S SGSFGAT INAWQDDWNARAWNT S FATNKHSQFLWGDAQVFNQGYGLLGFK 

^DTOSFISSGS^TINAWQDDWNARVWNTS FATNKHSQFLWGDAQVFNQGYGLLGFK 
LEDYE S FI SSGS FGAT INAWQDDWNARAWNT S FATNKHSQFLWGDAQVFNQGYGLLGFK 
LEDYESFISSGS FGAT INAWQDDWN ARAWNT S FATNKHSQFLWGDAQVFNQGYGLLGFK 

AKHHYQVDGKRGKGEWKHPLMTS ATGDDLYAS SDE S YLYLAIKTKPEKLKEKRLLPI DI 
AKHHYQVDGKRGKGEWKHPLMTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDI 
AKHHYQVDGKRGKGEWKHPLMTS ATGDDLYAS S DESYLYLAIKTKPEKLKEKRLLPIDI 
AKHHYQVDGKRGKGEWKHPIWTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDI 
AKHHYQVDGKRGKGEWKHPIKTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDI 
AKHHYQVDGKRGKGEWKHPLMTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDI 
AKHHYQVDGKRGKGEWKHPL^5TSATGDDLYASSDESYLYIJ^KTKPEKLKEKRLLPIDI 

AIOTHYQVDGraGKEEWra 

AKHHYQVDGKRGKGEWKHPLMTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDI 
AKHHYQVDGKRGKGEWKHPLMTSATGDDLYASSDESYLYLAIKTKPEKLKEKRLLPIDI 

pksgsrkmngskvtfskssdf^sidpngkselevqerynalk™ 

PKSGSRKMNGSKVTFSKSSDFVLSIDPNGKSELFVQERYNAIJCANYLRQ^^D^AFP 
PKSGSRKMNGSKVTFSKSSDFVLSIDPNGKSELFVQER 

PKSGSRKMNGSKVTFSKSS DFVLS I DPNGKSELFVQERYNALKAN YUIQLNGKDFYAFP 

PKSGSRKMNGSKVTFSKSSDFVLSIDPNGKSELFVQERYNAL^ 

PKSGSRKMNGSKVTFSKSSDFVIiSIDPNGKSELFVQERYNALKANYLRQI*NGKDFYAFP 
SEQ2856
SEQ2857
SEQ2858
SEQ2859 
SEQ2860 

SEQ2850 
SEQ2851 
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855 
SEQ2856 
SEQ2857 
SEQ2858 
SEQ2859 
SEQ2860 

SEQ2850 
SEQ2851 
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855 
SEQ2856 
SEQ2857 
SEQ2858 
SEQ2859 
SEQ2860 

SEQ2850 
SEQ2851 
SEQ2852 
SEQ2853 
SEQ2854 
SEQ2855 
SEQ2856 
SEQ2857 
SEQ2858 
SEQ2859 
SEQ2860 




PKSGSRKMNGSKVTFSKSSD1TO.SIDFN^^ 
PKSGSRK^GSKVTFSKSSDFVLSIDPNGKSE 

PKSGSRKMNGSKWFSKSSDFVLSIDPNGKSELFVQERYNALKANYLRQLNGKDFYAFP 
PKSGSRKMNGSKVTFSKSSDFVLSIDPNGKSELFVQERYNALKANYLRQLNGKDFYAFP 

KKNSSNFEQINMVLRNTKIVEDMEKVKATERFLPTHPTGLLCTGTIDRHQKTFDSQTDI 
KKNSSNFEQINMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTORHQKTFDSQTDI 
KKNSSNFEQINMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDI 
KKNSSNFEQINMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDI 
KKNSSNFEQINMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDI 
KKNSSNFEQINMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDI 
KKN S SN FEQINMVLRNTKI VEDMEKVKATERFLPTHPTGLLKTGTT DRHQKT ^^^^^ 

^SSNFEQINMVLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDI 
KKNSSNFEQINMVLRNTKIVEDMEKVKATERFLPTHPTGIiLKTGTTDRHQKTFDSQTDI 
KKNSSNFEQIN^WLRNTKIVEDMEKVKATERFLPTHPTGLLKTGTTDRHQKTFDSQTDI 

FGKDFIEVRIPWQLLNFSDPSSQKIHDDYFKHYGVKELEIESIALGLGANSKENTLIKM 

FGKDFIEVRI PWQLLNFS DPS SQKIHDDYFKHYGVKELE IES IALGLGANSKENTLIKM 
FGKDFIEVRIPWQLLNFSDPSSQKIHDDYFKHYGVKELEIESIALGLGANSKENTLIKM 
FGKDFIEVRI PWQLLNFS DPS SQRIHDDYFKHYGVKELENE PLLDVLIAKKTHRWQI IV 
FGKDFIEVRI PWQLLN FSDPS SQRIHDDYFKHYGVKELENEPLLDVLIAKKTHRWQI IV 
FGKDFIEVRI PWQLLNFS D PS SQKIHDDYFKHYGVKELE IES IALGLGANSKENTLIKM 
FGKDFI EVRI PWQLLN FS DP SSQKI HDDYFKHYGVKELEIE S IALGLGANSKENTLIKM 

FGKDFIEVRI PWQLLN FS DPS SQKIHDDYFKHYGVKELE IES I ALGLGAN SKENTLIKM 
FGKDFI EVRI PWQLLNFSDPS SQKIHDDYFKHYGVKELE IES I ALGLGANSKENTLIKM 
FGKDFIEVRIPWQLLNFSDPSSQKIHDDYFKHYGVKELEIESIALGLGANSKENTLIKM 

DYRLKNWERPDTKTFLKDSYYSIER 

DYRLKNWERPDTKTFLKDSYYVLRK 

DYRLKNWERPDTKTFLKDSYYSIKKEWSKERERTYGP 

IGRDPI PKPFK 

IGRDPIPKPFK 

DYRLKNWERPDTKTFLKDSYYVLRK 

DYRLKNWERPDTKTFLKD 

DYRLKNWERPDTKT FLKDSY YS IKK 

DYRLKNWERPDTKTFLKDSYYSIK 

DYRLKNWERPDTKT FLKDSYYSIKKEW 
Table 29: Comparative Sequences relating to SAG1641 (YaeC family protein)

^aSSt^ataaaa^^ 

GflAAATAMAAAAACTTfi^TTCCACTTGAAAAGACTTACTTAGC^ 

StISaSS 
ScaS^^ 

SEQ ID NO. 2902: SAG1641 FROM THE 1169NT1 GBS TYPE V STRAIN



AAAATAAGAAAAACTTAATTCCACTTGAAAAGACTTA< 

acaIXSaS^^ 

CACACAGATGAAGTGAAAAAAGTTATCAAAGATACTTCAGCTGATATTCCAC 

cacaSgatgaagtgaaaaaagttatcaaagatacttcagctgatattccacaatgg 

^^Sssssssssssssss^ssssssss 

cacacagatgaagtgaaaaaagttatcaaagatacttcagctgatattccacaatgg 



SEQ ID NO. 2906: SAG1641 FROM THE CJB110 GBS NONTYPEABLE STRAIN

AAGTAAAGTTGTTAAAGTTGGTGTTATGACCTTTTCTGACACTGAAAAAGCACGTTGGGATAAAATTGAAAAGCTAGT 

AGGCGATAAAGCTAAAATCAAATTCACAGAATTTACAGATTATACACAACCAAATCAAGCGACAGCCAATAAGGATG 

GGATATTAATGCCTTTCAACATTACAATTTCTTAGAAAACTGGAATAAGGAAAATAAGAAAAACTTAATTCCACTTGA

AAAGACTTACTTAGCCCCAATTCGTATCTATTCTGAGAAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTAT 

TGCAATTCCAAATGATGCAACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGT 

TTCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAAGATATTAATATTCAGGAGTTAGATGCGAG 

TCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATTAATAATACATACATTGAGCAAGCTAATTTAAAACC 

TTCAGATGCTATCTTTGTTGAGAAATCAGATAAAAATTCAAAACAATGGATTAATATCATTGCGGGACGTAAAAATTG

GAAAAAGCAAAAGAACGCTAAAGCTATCCAAGCTATCTTGGATGCTTATCACACAGATGAAGTGAAAAAAGTTATCAA 

AGATACTTCAGCTGATATTCCACAATGGAA 

SEQ ID NO. 2907: SAG1641 FROM THE COH1 GBS TYPE III STRAIN
(REVERSE COMPLEMENT) 

AGTTTCAGCAAGCTCAACTTCAAGTAAAGTTGTTAAAGTTGGTGTTATGACCTTTTCTGACACTGAAAAAGCACGTTG 
GGATAAAATTGAAAAGCTAGTAGGTGATAAAGCTAAAATCAAATTTACAGAATTTACAGATTATACACAACCAAATCA
AGCGACAGCCAATAAGGATGTGGATATTAATGCCTTTCAACATTACAATTTCTTAGAAAACTGGAATAAGGAAAATAA
GAAAAACTTAATTCCACTTGAAAAGACTTACTTAGCTCCAATTCGTATCTATTCTGAGAAGGTAAAATCTCTTAAAAA
ATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCAACAAATGGTAGCCGTGCATTGTATGTACTTCAGTCAGC 
AGGTTTAATCAAATTGAATGTTTCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAA 
TATTCAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATTAATAATACATACAT 
TGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAATCAGATAAAAATTCAAAACAATGGATTAATAT
CATTGCGGGACGTAAAAATTGGAAAAAGCAAAAGAACGCTAAAGCTATCCAAGCTATCTTGGATGCTTATCACACAGA 
TGAAGTGAAAAAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG 

SEQ ID NO. 2908: SAG1641 FROM THE H36b GBS TYPE Ib STRAIN

AAGAAGTTTCAGCAAGCTCAACTTCAAGTAAAGTTGTTAAAGTTGGTGTTATGACCTTTTCTGACACTGAAAAAGCAC 
GTTGGGATAAAATTGAAAAGCTAGTAGGTGATAAAGCTAAAATCAAATTTACAGAATTTACAGATTATACACAACCAA
ATCAAGCGACAGCCAATAAGGATGTGGATATTAATGCCTTTCAACATTACAATTTCTTAGAAAACTGGAATAAGGAAA
ATAAGAAAAACTTAATTCCACTTGAAAAGACTTACTTAGCTCCAATTCGTATCTATTCTGAGAAGGTAAAATCTCTTA
AAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCAACAAATGGTAGCCGTGCATTGTATGTCCTTCAGT
CAGCAGGTTTAATCAAATTGAATGTTTCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATA
TTAATATTCAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATTAATAATACAT 
ACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAATCAGATAAAAATTCAAAACAATGGATTA 
ATATCATTGCGGGACGTAAAAATTGGAAAAAGCAAAAGAACGCTAAAGCTATCCAAGCTATCTTGGATGCTTATCACA 
CAGATGAAGTGAAAAAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG 

SEQ ID NO. 2909: SAG1641 FROM THE JM9130013 GBS TYPE VIII STRAIN

TTCAGCAAGCTCAACTTCAAGTAAAGTTGTTAAAGTTGGTGTTATGACCTTTTCTGACACTGAAAAAGCACGTTGGGA 

TAAAATTGAAAAGCTAGTAGGTGATAAAGCTAAAATCAAATTTAC^GAATTTAC^ 

GACAGCCAATAAGGATGTGGATATTAATGCCTTTCAACATTACAATTTCTTAGAAAACTGGAATAAGGAAAATAAGAA 
AAACTTAATTCCACTTGAAAAGACTTACTTAGCTCCAATTCGTATCTATTCTGAGAAGGTAAAATCTCTTAAAAAATT
GAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCAACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGG
TTTAATCAAATTGAATGTTTCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATAT 
TCAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATTAATAATACATACATTGA 
GCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAATCAGATAAAAATTCAAAACAATGGATTAATATCAT 
TGCGGGACGTAAAAATTGGAAAAAGCAAAAGAACGCTAAAGCTATCCAAGCTATCTTGGATGCTTATCACACAGATGA
AGTGAAAAAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG 

SEQ ID NO. 2910: SAG1641 FROM THE M732 GBS TYPE III STRAIN 

AATCAAGAAGTTTCAGCAAGCTCAACTTCAAGTAAAGTTGTTAAAGTTGGTGTTATGACCTTTTCTGACACTGAAAAA 
GCACGTTGGGATAAAATTGAAAAGCTAGTAGGTGATAAAGCTAAAATCAAATTTACAGAATTTACAGATTATACACAA 
CCAAATCAAGCGACAGCCAATAAGGATGTGGATATTAATGCCTTTCAACATTACAATTTCTTAGAAAACTGGAATAAG 
GAAAATAAGAAAAACTTAATTCCACTTGAAAAGACTTACTTAGCTCCAATTCGTATCTATTCTGAGAAGGTAAAATCT 
CTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCAACAAATGGTAGCCGTGCATTGTATGTCCTT 
CAGTCAGCAGGTTTAATCAAATTGAATGTTTCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAG 
GATATTAATATTCAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATTAATAAT 
ACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAATCAGATAAAAATTCAAAACAATGG 
ATTAATATCATTGCGGGACGTAAAAATTGGAAAAAGCAAAAGAACGCTAAAGCTATCCAAGCTATCTTGGATGCTTAT 
CACACAGATGAAGTGAAAAAAGTTATCAAAGATAC 
SEQ ID NO. 2911: SAG1641 FROM THE M781 GBS TYPE III STRAIN

TGAAGTGAAAAAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG 

SK02901 ATCAAGAAGTTTCAGCAAGCTCARCTTCAAGTAAAGTTGTTAAAGTTGGTGTTATGACC 

ATCAAGi^GTTTCAGCAAGCTCAACTTCAAGTAAAGT^ 
SE «2902 ATraRTAAGTTTCAGCA^GCTCAACTTCAAGTAAAGTTGTTAAAGTTGGT 

£11111 ^^^-^AGTTTCAGCAAGCTC^^CTTCAAGTAAAGTTGTTAAAGTTGGTGTTATGACC 

«™«o oni TTTTCTGACACTGAAAAAGCACGTTGGGATAAAATTGAAAAGCTAGTAGGCGATAAAGCT 
OTTTCTGACACTGAAAARGCACGTTGGGATAAAATTGAAAAGCTAGTAGGTGATAAAGCT 

iiTTCTGACACTGAAAAAGCACGTTGGGATAAAATTGAAAAGCTAGTAGGTGATAAAGCT 
^«"04 OTTTCTGAC^CTGAA^AAGCACGTTGGGATAAAATTGAAAAGCTAGTAGGTGATAAAGCT 

I™^7 ttttctgacactgaaaaagcacgttgggataaaattgaaaagctagtaggtgataaagct 
«o!oft« ttttctgaSct^^aaaagcacgttgggataaaattgaaaagctagtaggtgata?\agct 

£S£ AAAATCAAATOTACAGAATTT^ 

ffiSSol AAAATCAAATTCACAGAATTTACAGATTATACACAACCAAATCAAGCGACAGCCAATAAG 

~SSo? AAAATCAAATTT ACAGAATTT ACAGATT AT ACACARCCAAAT '"^^^^^Sonan^nnr 

ll^ljl AAAATCAAATTTACAC^TTTAC^GATTATACACAACCAAATCAAGCGACAGCCAATAAG 

«£ono AAAATCAAATTTACAGAATTTACAG 

£g™ S^Sacagaatttacagattatacacaaccaaatcaag 
^Q"02 gmotSatattaatgcctttcaacatta™ 

Hallos ^S^MAACTTAATTCCACTTGAAAAQACTTACTTAGCTCC&ATT CGTATCTATTCTGAG 

AMAAARACTTAATTCCACTTGAAAAGACTTACT^AGCCCCAATTCGTA 

SIS? a^a^Saattccacwgaaargacttactt 



SEQ2909
SEQ2910 
SEQ2911 

SEQ2901 
SEQ2902 
SEQ2903 
SEQ2904 
SEQ2905 
SEQ2906 
SEQ2907 
SEQ2908 
SEQ2909 
SEQ2910 
SEQ2911 

SEQ2901 
SEQ2902 
SEQ2903 
SEQ2904 
SEQ2905 
SEQ2906 
SEQ2907 
SEQ2908 
SEQ2909 
SEQ2910 
SEQ2911 

SEQ2901 
SEQ2902 
SEQ2903 
SEQ2904 
SEQ2905 
SEQ2906 
SEQ2907 
SEQ2908 
SEQ2909 
SEQ2910 
SEQ2911 

SEQ2901 
SEQ2902 
SEQ2903 
SEQ2904 
SEQ2905 
SEQ2906 
SEQ2907 
SEQ2908 
SEQ2909 
SEQ2910 
SEQ2911 

SEQ2901 
SEQ2902 
SEQ2903 
SEQ2904 
SEQ2905 
SEQ2906 
SEQ2907 
SEQ2908 
SEQ2909 
SEQ2910 
SEQ2911 



AAGAAAAACTTAATTCCACTTGAAAAGACTTACTTAGCTCCAATTCGTATCTATTCTGAG
AAGAAAAACTTAATTCCACTTGAAAAGACTTACTTAGCTCCAATTCGTATCTATTCTGAG 
AAGAAAAACTTAATTCCACTTGAAAAGACTTACTTAGCTCCAATTCGTATCTATTCTGAG 

AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 
AAGGTAAAATCTCTTAAAAAATTGAAAAAAGGAGCCACTATTGCAATTCCAAATGATGCA 

ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTACTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 
ACAAATGGTAGCCGTGCATTGTATGTCCTTCAGTCAGCAGGTTTAATCAAATTGAATGTT 

TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAAGATATTAATATT
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 
TCTGGTAAGAAGGTTGCAAGAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAAGATATTAATATT 
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 
TCTGGTAAGAAGGTTGCAACAGTTGCTAATATCACATCTAATAAAAAGGATATTAATATT 

CAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATT 
CAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATT
CAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATT 
CAGGAGTTAGATGCCAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATT 
CAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATT 
CAGGAGTTAGATGCX3AGTGAAACACCACGTGCACTCAAAGAT 

CAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATCTAGATGCAGCTATTATT 
CAGGAGTTAGATGCGAGTCAAACACCACGTGGACTCAAAGATGTAGATGCAGCTATTATT 
CAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATT 
CAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATT 
CAGGAGTTAGATGCGAGTCAAACACCACGTGCACTCAAAGATGTAGATGCAGCTATTATT

AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 
AATAATACATACATTGAGCAAGCTAATTTAAAACCTTCAGATGCTATCTTTGTTGAGAAA 

SgJSS T^AGRT^^ATTCAAAACAATGGATTARTATCATTGCGGGACGTAARftATTGGAARAAG 
Sollofi T^GATAAA^TTCARAACAATGGATTAATAT(^TTGCGGGACGTARAAATTGGiU\AAAG 
llollll TCA^TAAA^TTCAAAACAATGGATTAATATCATTGCGGGACGTAAAAATTGGAAARRG 

Sloe ?SS?aaaaa™^ 

llollol t^gataaaaattcaaaacaatggattaatatcattgcgggacgtaaaaattggaaaaag 
Slio tcagataaaaattcaaaacaatggattaatatcattgcgggacgtaaa^ 
£5m tcagataaaaatt 

SBO2901 caaaagaacgctaaagctatccaagctatcttggatgcttatcacacagatgaagtgaaa 
caaaagaacgctawvgctatccaagctatcttggatgcttatcacacagatgaagtgaaa 
£*!ra Sma^gaacgctaaagctatccaagctatcttggatgcttatcacacagatgaagtcaaa 
£»m Saaagaacgctaaagctatccaagctatcttggatgcttatcacaca^ 
«S;!ot caaaagaacgctaaagctatccaagctatcttggatgcttatcacacagatgaagtgaaa 
caaaagaacgctaaagctatccaagctat 

11111m caaaagaacgctawusctatccaagcta^ 

IeoS caaaagaacgcta^gctatccaagctatcttggatgcttatcacacagatgaagtgaaa 

SEQ2901 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGGAACCCAGCTTTCTTGTACAA
SEQ2902 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG
SEQ2903 AAAGTTATCAAAGATACTTCAGCTGATATTCCAC
SEQ2904 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG
SEQ2905 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG
SEQ2906 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGGAA
SEQ2907 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG
SEQ2908 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG
SEQ2909 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG
SEQ2910 AAAGTTATCAAAGATAC
SEQ2911 AAAGTTATCAAAGATACTTCAGCTGATATTCCACAATGG

>SEQ ID NO 2950:35_090 frame: 1

NOEVSASSTSSKVVKVGVIOTFSDTEKfiRWDKIEKXiVGDlCAKIKFTEFTDYTQPNQATAllK 
TOGSRAL^QSAGLIKLtWSGKKVATVANITSNKKDINIQELDASQTPFU^VDM 

SieS^psdaifveksdknskqwihiiagbk^qknakaiqaildayhtdevk 

KVIKDTSADI PQWNPAFLY 

>SEQ ID NO 2951:35_1169NT frame: 3
QEVSASSTSSKWKVGVMTE^SDTEKARWDKIEKLVGDK^ 

VDINAFQHYNFIJ^KENKKNLIPU^ 

ngsiu,lyVlqsagliki^sgkkvatvanitsnkkdiniqeldasqtppai^v^ito 

NTY:TCQANIiCTSDAIFVEKSDKNSK<WINIIAGR 
VIKDTSADIPQW 

>SEQ ID NO 2952:35_18RS21 frame: 1

■TNGSP^Y^QSAGLIKLWSGK^^^ 

NNTYIEQANIjKPSDAI fveks dkn SKQW ini i agrkn wkkqknakaiqai ldayhtdevk 

KVIKDTSADI P 

>SEQ ID NO 2953:35_2603 frame: 1

.nqevsasstsskvvkvgvmtfsdtekarw^^^ 

DVDINAroHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEK 

TNGSPALYVliQSAGLIKLl^SGKKVATVANITSNKKDINIQEUDASQTPRA^DVDAM 
NNTYIEOANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAIIiDAYriTDEVK 

KVIKDTSADIPQW 
>SEQ ID NO 2954:35_A909 frame: 1

NQEVS AS STSS KWKVGVMTFS DTEKARWDKIEKLVGDKAKIKFTEFTDYTQPN QATANK 
DVDINAFQHYN FLENWNKENKKNLI PLEKTYLAPIRIYSEKVKSLKKLKKGAT IAI PNDA 
TNGSRALYVIiQSAGLIKI^SGKKVATVANITSNKKDINIQELDASQTPRAI^DVD 
NNTYIEQANLKPSDAI FVEKS DKN SKQWINI IAGRKNWKKQKNAKAIQAI LDAYHTDEVK 
KVIKDTSADI PQW 

>SEQ ID NO 2955:35_CJB110 frame: 2

skwkvgvmt fs dtekarwdkieklvgdkakikfte ftdytqpnqatankdvdinafqhy 

nfi^nwnkenkknliplektyiapiriysekvkslkklkkgatiaipnda™ 

qsagliklwsgkkvatvanitsnkkdiniqeldasqtpralkdvdaaiinntyieqanl 

KPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVKKVIKDTSADI 
PQW 

>SEQ ID NO 2956:35_COH1 frame: 2

VSASSTSSKWKVGVMTFSDTEKARWDKIEKLVGDKAKIKFTEFTDYTQPNQATANKDVD 
INAFQHYN FLENWNKENKKNLI PLEKTYLAPIRI YSEKVKSLKKLKKGAT IAI PN DATNG 
SRALYVLQS AGLIKLNVS GKKVATVAN ITSNKKDINIQELDASQTPRALKDVDAAI INNT 
YIEQANLKPS DAI FVEKS DKNSKQW IN I IAGRKNWKKQKNAKAIQAIL DAYHTDEVKKVI 
KDTSADIPQW 

>SEQ ID NO 2957:35_H36B frame: 3

EVSAS STS SKWKVGVMT FS DTEKARWDKIEKLVGDKAKIKFTEFT DYTQFNQATANKDV 
DINAFQHYN FLENWNKENKKNLI PLEKTYLAPI RI YSEKVKSLKKLKKGAT IAI PNDATN 
GS RALYVLQSAGLIKLNVSGKKVATVAN ITSNKKDIN IQELDASQT PRALKDVDAAI INN 
T YIEQANLKPSDAI FVEKSDKNS KQW IN I IAGRKNWKKQKNAKAIQAI LDAYHTDEVKKV 
IKDTSADIPQW 

>SEQ ID NO 2958:35_JM9130013 frame: 2 

SAS STS SKWKVGVMT FS DTEKARWDKIEKLVG DKAKI KFTEFTDYTQPNQATANKDVDI 
NAFQHYNFLENWNKENKKNLI PLEKTYLAPIRIYSEKVKSLKKLKKGAT IAI PN DATNGS 
RALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAIINNTY 
IEQANLKPS DAIFVEKSDKNSKQWINI IAGRKNWKKQKNAKAIQAILDAYHTDEVKKVIK 
DTSADIPQW 

>SEQ ID NO 2959:35_M732 frame: 1 

NQEVSAS STS SKWKVGVMTFSDTEKARWDKIEKLVGDKAKIKFTEFTDYTQPNQATANK 
DVDINAFQHYN FLENWNKENKKNLI PLEKTYLAPIRI YSEKVKSLKKLKKGAT IAI PNDA 
TNGSRALYVLQSAGLIKLNVS GKICVATVANIT SNKKDIN IQELDASQT PRALKDVDAAI I 
NNTYIEQANLKPS DAI FVEKS DKN SKQWINI I AGRKNWKKQKNAKAIQAILDAYHTDEVK 
KVIKD 

>SEQ ID NO 2960:35_M781 frame: 2 

VSASSTSSKWKVGVMT FSDTEKARWDKIEKLVGDKAKIKFTEFT DYTQPNQATANKDVD 
INAFQHYNFIiENWNKENKKNLI PLEKTYLAP IRIYSBKVKSLKKLKKGATI AI PNDATNG 
SRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAIINNT 
YIEQANLKPS DAI FVEKS DKNSKQW IN I IAGRKNWKKQKNAKAIQAIWDAYHTDEVKKVI 
KDTSADIPQW 

SEQ2950 QEVSASSTSSKVVKVGVMTFSDTEBCARWDKIEKLVGDBCAKIKFTEFTDYTQPNQATANK 
SEQ2951 QEVSAS ST S SKWKVGVMTFS DTEKARWDKIEKLVGDKAKIKFTEFT DYTQPNQATANK 

SEQ2952 QEVS ASSTS SKWKVGVMTFS DTEKARW DK IEKLVG DKAKI KFTE FT DYTQPNQATANK 

SEQ2953 QEVSAS STS SKWKVGVMTFS DTEKARWDKIEKLVGDKAKIKFTEFT DYTQPNQATANK 

SEQ2954 QEIVSAS ST S SKWKVGVMTFS DTEKARWDKIEKLVG DKAKI KFTE FTD YTQPNQAT AN K 

SEQ2955 SKWKVGVMTFS DTEKARWDKIEKLVG DKAKI KFTE FT DYTQPNQATANK 

SEQ2956 — VSAS STS SKWKVGVMTFS DTEKARWDKIEKLVG DKAKI KFTE FT DYTQPNQATANK 
SEQ2957 -EVSASSTSSKWKVGVMTFSDTEKARWDKIEKLVGDKAKIKFTEFTDYTQPNQATANK 

SEQ2958 SASSTSSKWKVGVMTFSDTEKARWDKIEKLVGDKAKIKFTEFTDYTQPNQATANK 

SEQ2959 QEVSASSTSSKWKVGVMTFSDTEKARWDKIEKLVGDKAKIKFTEFTDYTQPNQATANK
SEQ2960 — VSAS STSSKWKVGVMTFS DTEKARWDKIEKLVGDKAKIKFTE FT DYTQPNQATANK 

SEQ2950 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2951 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2952 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2953 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2954 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2955 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2956 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2957 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2958 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2959 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA
SEQ2960 DVDINAFQHYNFLENWNKENKKNLIPLEKTYLAPIRIYSEKVKSLKKLKKGATIAIPNDA

SEQ2950 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2951 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2952 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2953 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2954 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2955 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2956 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2957 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2958 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2959 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII
SEQ2960 TNGSRALYVLQSAGLIKLNVSGKKVATVANITSNKKDINIQELDASQTPRALKDVDAAII

SEQ2950 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2951 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2952 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2953 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2954 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2955 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2956 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2957 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2958 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2959 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAILDAYHTDEVK
SEQ2960 NNTYIEQANLKPSDAIFVEKSDKNSKQWINIIAGRKNWKKQKNAKAIQAIWDAYHTDEVK

SEQ2950 KVIKDTSADIPQWNPAFLY 

SEQ2951 KVIKDTSADIPQW 

SEQ2952 KVIKDTSADIP 

SEQ2953 KVIKDTSADIPQW 

SEQ2954 KVIKDTSADIPQW 

SEQ2955 KVIKDTSADIPQW 

SEQ2956 KVIKDTSADIPQW 

SEQ2957 KVIKDTSADIPQW 

SEQ2958 KVIKDTSADIPQW 

SEQ2959 KVIKD 

SEQ2960 KVIKDTSADIPQW 
Table 30: Comparative Sequences relating to SAG2147
(protein of unknown function / lipoprotein, putative)

SEQ ID NO. 3001: SAG2147 FROM THE 1169NT1 GBS TYPE V STRAIN 
(REVERSE COMPLEMENT) 

AAAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCC
AAAAAATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCT 
CCAAAACCTTCTCAGGCATCTAATGAAGTCtCAAAATCAAGTTCTCAATCTACAGAAGCT 
AATTCTCAGCAACAAGTTACTGCGAGTGAAGAGGCGGCTGTAGAACAAGCAGTTGTAACA 
GAAAATACCCCTGCTACCAGTCAGGCACAACAAACTTATGCTGTTACTGAGACAACTTAC
AAACCTGCTCAACACCAGACAAGTGGCCAAGTATTGAGCAATGGAAATACTGCAGGGGCG 
GTCGGATCTGCTGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTTGG 
GAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAGCT 
TCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAGTT 
AATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTAC 

SEQ ID NO. 3002: SAG2147 FROM THE 18RS21 GBS TYPE II STRAIN 
(REVERSE COMPLEMENT) 

AAAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTC 
GCGTAGCCAAAAAATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAA
AACAGGCTCCAAAACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTA 
CAGAAGCTAATTCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAG 
TTGTAACAGAAAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGCTGTTACTGAGA 
CAACTTATAGACCTGCTCAACACCAGACGAGTGGCCAAGTATTGAGTAATGGAAATACTG
CAGGGGCTATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGT
CTACTTGGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCT 
CAGGAGCTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGG 
ATCAAGTTAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTAC 

SEQ ID NO. 3003: SAG2147 FROM THE 2603 V/R GBS TYPE V STRAIN 
(REVERSE COMPLEMENT) 

AAAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGT 
TCGCGTAGCCAAAAAATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGT 
AAAACAGGCTCCAAAACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATC 
TACAGAAGCTAATTCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGTAGAACAAGC 
AGTTGTAACAGAAAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGCTGTTACTGA 
GACAACTTATAGACCTGCTCAACACCAGACGAGTGGCCAAGTATTGAGTAATGGAAATAC 
TGCAGGGGCTATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCA
GTCTACTTGGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGC 
CTCAGGAGCTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCA 
GGATCAAGTTAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTA
C

SEQ ID NO. 3004: SAG2147 FROM THE 090 GBS TYPE Ia STRAIN
(REVERSE COMPLEMENT) 

TAGCCAAAAAATCAAAAATGATTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAAC 
AGGCTCCAAAACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAG 
AAGCTAATTCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAGTTG 
TAACAGAAAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGCTGTTACTGAGACAA 
CTTATAGACCTGCTCAACACCAGACGAGTGGCCAAGTATTGAGTAATGGAAATACTGCAG 
GGGCTATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTA 
CTTGGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAG 
GAGCTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGA

SEQ ID NO. 3005: SAG2147 FROM THE A909 GBS TYPE Ia STRAIN
(REVERSE COMPLEMENT) 

AAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCAAAACCTTCTCAGGCA 
TCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAATTCTCAGCAACAAGTT 
ACTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAGTTGTAACAGAAAACACCCCTGCTACC
AGTCAGGCACAACAAGCTTATGCTGTTACTGAGACAACTTATAGACCTGCTCAACACCAG 
ACAAGTGGCCAAGTATTGAGTAATGGAAATACTGCAGGGGCTATTGGCTCAGCAGCTGCA 
GCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTTGGGAACATATTATTGCCCGT 
GAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAGCTTCAGGACTTTTCCAAACG 
ATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGAATCAAGTTAATTCAGCTATTAAAGCT
TATCGTGCTCAAGGTTTATCA

SEQ ID NO. 3006: SAG2147 FROM THE CJB110 GBS NONTYPEABLE STRAIN 
(REVERSE COMPLEMENT) 

AATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCCAAAAAATCAAAAATGACTAAGGCGA 
CATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCAAAACCTTCTCAGGCATCTAATG 
AAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAATTCTCAGCAACAAGTTACTGCGA 
GTGAAGAGGCAGCTGTAGAACAAGCAGTTGTAACAGAAAACACCCCTGCTACCAGTCAGG 
CACAACAAGCTTATGCTGTTACTGAGACAACTTATAGACCTGCTCAACACCAGACGAGTG 
GCCAAGTATTGAGTAATGGAAATACTGCAGGGGCTATTGGCTCAGCAGCTGCAGCACAAA 
TGGCTGCTGCAACAGGAGTCCCTCAGTCTACTTGGGAACATATTATTGCCCGTGAATCAA 
ATGGTAATCCTAATGTTGCTAATGCCTCAGGAGCTTCAGGACTTTTCCAAACGATGCCAG 
GTTGGGGTTCAACAGCTACAGTTCAGGATCAAGTTAATTCAGCTATTAAAGCTTATCGTG
CTCAAGGTTTATCAGCTTGGGGTTAC

SEQ ID NO. 3007: SAG2147 FROM THE COH1 GBS TYPE III STRAIN
(REVERSE COMPLEMENT) 

AAAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAA
AGTTCGCGTAGCCAAAAAATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGA
TGTAAAACAGGCTCCAAAACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCA 
ATCTACAGAAGCTAATTCTCAGCAACAAGTTACTGCGAGTGAAGAGGCGGCTGTAGAACA 
AGCAGTTGTAACAGAAAATACCCCTGCTACCAGTCAGGCACAACAAACTTATGCTGTTAC 
TGAGACAACTTACAAACCTGCTCAACACCAGACAAGTGGCCAAGTATTGAGCAATGGAAA 
TACTGCAGGGGCGGTCGGATCTGCTGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCC 
TCAGTCTACTTGGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAA 
TGCCTCAGGAGCTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGT 
TCAGGATCAAGTTAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGG
TTAC

SEQ ID NO. 3008: SAG2147 FROM THE H36b GBS TYPE Ib STRAIN
(REVERSE COMPLEMENT)

AAAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGC
AGATAAAGTTCGCGTAGCCAAAAAATCAAAAATGACTAAGGCGACATCTAAATCAAAAGT
AGAAGATGTAAAACAGGCTCCAAAACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAG
TTCTCAATCTACAGAAGCTAATTCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGT
AGAACAAGCAGTTGTAACAGAAAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGC
TGTTACTGAGACAACTTATAGACCTGCTCAACACCAGACAAGTGGCCAAGTATTGAGTAA
TGGAAATACTGCAGGGGCTATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGG
AGTCCCTCAGTCTACTTGGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGT
TGCTAATGCCTCAGGAGCTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGC
TACAGTTCAGGATCAAGTTAATTCAGCTATTAAAGCTT

SEQ ID NO. 3009: SAG2147 FROM THE M732 GBS TYPE III STRAIN 
(REVERSE COMPLEMENT) 

AAAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGC 
CAAAAAATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGC 
TCCAAAACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGC 
TAATTCTCAGCAACAAGTTACTGCGAGTGAAGAGGCGGCTGTAGAACAAGCAGTTGTAAC 
AGAAAATACCCCTGCTACCAGTCAGGCACAACAAACTTATGCTGTTACTGAGACAACTTA 
CAAACCTGCTCAACACCAGACAAGTGGCCAAGTATTGAGCAATGGAAATACTGCAGGGGC 
GGTCGGATCTGCTGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTTG 
GGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAGC 
TTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAGT
TAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTA

SEQ ID NO. 3010: SAG2147 FROM THE M781 GBS TYPE III STRAIN 
(REVERSE COMPLEMENT) 

GTAACCCCAAGCTGATAAACCTTGAGCACGATAAGCTTTAATAGCTGAATTAACTTGATC
CTGAACTGTAGCTGTTGAACCCCAACCTGGCATCGTTTGGAAAAGTCCTGAAGCTCCTGA
GGCATTAGCAACATTAGGATTACCATTTGATTCACGGGCAATAATATGTTCCCAAGTAGA 
CTGAGGGACTCCTGTTGCAGCAGCCATTTGTGCTGCAGCAGCAGATCCGACCGCCCCTGC 
AGTATTTCCATTGCTCAATACTTGGCCACTTGTCTGGTGTTGAGCAGGTTTGTAAGTTGT 
CTCAGTAACAGCATAAGTTTGTTGTGCCTGACTGGTAGCAGGGGTATTTTCTGTTACAAC 
TGCTTGTTCTACAGCCGCCTCTTCACTCGCAGTAACTTGTTGCTGAGAATTAGCTTCTGT 
AGATTGAGAACTTGATTTTGGGGCTTCATTAGATGCCTGAGAAGGTTTTGGAGCCTGTTT 
TACATCTTCTACTTTTGATTTAGATGTCGCCTTAGTCATTTTTGATTTTTTGGCTACGCG
AACTTTATCTGCTTTTGACAAAGA
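
For illustration only: SEQ ID NO. 3010 appears in this listing in the opposite strand orientation to SEQ ID NOs. 3001 to 3009, so a base-by-base comparison first requires its reverse complement. The following is a minimal Python sketch of that step, assuming unambiguous A/C/G/T input. The helper name `reverse_complement` is hypothetical and is not part of the original disclosure, and the 24-base example string is read from the start of SEQ ID NO. 3010 under the assumption that its leading characters are GTAACC.

```python
# Hypothetical helper for illustration; not part of the original sequence listing.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    # Complement every base, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


# Assumed reading of the first 24 bases of SEQ ID NO. 3010.
head_3010 = "GTAACCCCAAGCTGATAAACCTTG"
print(reverse_complement(head_3010))
```

Under these assumptions the result corresponds to the final 24 bases listed for SEQ ID NO. 3001, which is consistent with the two entries covering the same region on opposite strands.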



SEQ3001
SEQ3002
SEQ3003
SEQ3004
SEQ3005 AGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCAAAACCTTCTCAGGCA
SEQ3007
SEQ3008
SEQ3009
SEQ3010

SEQ3001
SEQ3002
SEQ3003
SEQ3004
SEQ3005 CTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAATTCTCAGCAACAAGTT
SEQ3007
SEQ3008
SEQ3009
SEQ3010

SEQ3001
SEQ3002
SEQ3003
SEQ3004
SEQ3005 CTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAGTTGTAACAGAAAACACCCCTGCTACC
SEQ3007
SEQ3008
SEQ3009
SEQ3010

SEQ3001
SEQ3002
SEQ3003
SEQ3004
SEQ3005 GTCAGGCACAACAAGCTTATGCTGTTACTGAGACAACTTATAGACCTGCTCAACACCAG
SEQ3007
SEQ3008
SEQ3009
SEQ3010

SEQ3001
SEQ3002
SEQ3003
SEQ3004
SEQ3005 CAAGTGGCCAAGTATTGAGTAATGGAAATACTGCAGGGGCTATTGGCTCAGCAGCTGCA
SEQ3007
SEQ3008
SEQ3009
SEQ3010
SEQ3001
SEQ3002
SEQ3003
SEQ3004
SEQ3005 CACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTTGGGAACATATTATTGCCCGT
SEQ3007
SEQ3008
SEQ3009
SEQ3010

SEQ3001
SEQ3002
SEQ3003
SEQ3004
SEQ3005 AATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAGCTTCAGGACTTTTCCAAACG
SEQ3007
SEQ3008
SEQ3009
SEQ3010

SEQ3001
SEQ3002
SEQ3003
SEQ3004
SEQ3005 TGCCAGGTTGGGGTTCAACAGCTACAGTTCAGAATCAAGTTAATTCAGCTATTAAAGCT
SEQ3007
SEQ3008
SEQ3009
SEQ3010

SEQ3001 AAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCCAAA
SEQ3002 AAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCCAAA
SEQ3003 AAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCCAAA
SEQ3004                                                    TAGCCAAA
SEQ3005 ATCGTGCTCAAGGTTTATCASAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCCAAA
SEQ3007 AAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCCAAA
SEQ3008 AAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCCAAA
SEQ3009 AAAGTTCACAAGTTACTACTGAATCTTTGTCAAAAGCAGATAAAGTTCGCGTAGCCAAA
SEQ3010 GTAACCCCAAGCTGA TAAACCTTGAGCACGATAAGCTTTAATAGCTGAA

SEQ3001 AATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCA
SEQ3002 AATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCA
SEQ3003 AATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCA
SEQ3004 AATCAAAAATGATTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCA
SEQ3005 AATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCA
SEQ3007 AATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCA
SEQ3008 AATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCA
SEQ3009 AATCAAAAATGACTAAGGCGACATCTAAATCAAAAGTAGAAGATGTAAAACAGGCTCCA
SEQ3010 TAACTTGATCCTGAACTGTAGCTGTTGAACCCCAACCTGGCATCGTTTGGAAAAGTCCT

SEQ3001 AACCTTCTCAGGCATCTAATGAAGTCCCAAAATCAAGTTCTCAATCTACAGAAGCTAAT
SEQ3002 AACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAAT
SEQ3003 AACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAAT
SEQ3004 AACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAAT
SEQ3005 AACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAAT
SEQ3007 AACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAAT
SEQ3008 AACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAAT
SEQ3009 AACCTTCTCAGGCATCTAATGAAGCCCCAAAATCAAGTTCTCAATCTACAGAAGCTAAT
SEQ3010 AAGCTCCTGAGGCATT AGCAACATTAGGATTAC-CATTTGATTCACGGGCAATAAT

SEQ3001 TCTCAGCAACAAGTTACTGCGAGTGAAGAGGCGGCTGTAGAACAAGCAGTTGTAACAGA
SEQ3002 TCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAGTTGTAACAGA
SEQ3003 TCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAGTTGTAACAGA
SEQ3004 TCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAGTTGTAACAGA
SEQ3005 TCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAGTTGTAACAGA
SEQ3007 TCTCAGCAACAAGTTACTGCGAGTGAAGAGGCGGCTGTAGAACAAGCAGTTGTAACAGA
SEQ3008 TCTCAGCAACAAGTTACTGCGAGTGAAGAGGCAGCTGTAGAACAAGCAGTTGTAACAGA
SEQ3009 TCTCAGCAACAAGTTACTGCGAGTGAAGAGGCGGCTGTAGAACAAGCAGTTGTAACAGA
SEQ3010 TGTTCCCAAGTAGACTGAGGGACTCCTGTTGCAGCAGCCATTTGTGCTGCAGCAGCAGA

SEQ3001 --AAATACCCCTGCTACCAGTCAGGCACAACAAACTTATGCTGTTACTGAGACAACTTA
SEQ3002 --AAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGCTGTTACTGAGACAACTTA
SEQ3003 --AAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGCTGTTACTGAGACAACTTA
SEQ3004 --AAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGCTGTTACTGAGACAACTTA
SEQ3005 --AAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGCTGTTACTGAGACAACTTA
SEQ3007 --AAATACCCCTGCTACCAGTCAGGCACAACAAACTTATGCTGTTACTGAGACAACTTA
SEQ3008 --AAACACCCCTGCTACCAGTCAGGCACAACAAGCTTATGCTGTTACTGAGACAACTTA
SEQ3009 --AAATACCCCTGCTACCAGTCAGGCACAACAAACTTATGCTGTTACTGAGACAACTTA
SEQ3010 CCGACCGCCCCTGCAGTATTTCCATTGCTCAATACTTG-GCCACTTGTCTGGTGTTGAG

SEQ3001 AAACCTGCTCAACACCAGACAAGTGGC-CAAGTATTGAGCAATGGAAATACTGCAGGGG
SEQ3002 AGACCTGCTCAACACCAGACGAGTGGC-CAAGTATTGAGTAATGGAAATACTGCAGGGG
SEQ3003 AGACCTGCTCAACACCAGACGAGTGGC-CAAGTATTGAGTAATGGAAATACTGCAGGGG
SEQ3004 AGACCTGCTCAACACCAGACGAGTGGC-CAAGTATTGAGTAATGGAAATACTGCAGGGG
SEQ3005 AGACCTGCTCAACACCAGACGAGTGGC-CAAGTATTGAGTAATGGAAATACTGCAGGGG
SEQ3007 AAACCTGCTCAACACCAGACAAGTGGC-CAAGTATTGAGCAATGGAAATACTGCAGGGG
SEQ3008 AGACCTGCTCAACACCAGACAAGTGGC-CAAGTATTGAGTAATGGAAATACTGCAGGGG
SEQ3009 AAACCTGCTCAACACCAGACAAGTGGC-CAAGTATTGAGCAATGGAAATACTGCAGGGG

SEQ3010 AGGTTTGTAAGTTGTCTCAGTAACAGCATAAGTTTGTTGTGCCTGACTGGTAGCAGGGG 

SEQ3001 GGTCGGATCTGCTGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTT 

SEQ3002 TATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTT

SEQ3003 TATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTT

SEQ3004 TATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTT 

SEQ3005 TATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTT 
SEQ3007 GGTCGGATCTGCTGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTT 
SEQ3008 TATTGGCTCAGCAGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTT 
SEQ3009 GGTCGGATCTGCTGCTGCAGCACAAATGGCTGCTGCAACAGGAGTCCCTCAGTCTACTT 
SEQ3010 A-TTT--TCTGTTACAACTGCTTGTTCTACAGCCGCCTCTTCACTCGCAGTAACTTGTT

SEQ3001 GGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAG 
SEQ3002 GGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAG
SEQ3003 GGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAG 
SEQ3004 GGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAG 
SEQ3005 GGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAG 
SEQ3007 GGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAG 
SEQ3008 GGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAG 
SEQ3009 GGGAACATATTATTGCCCGTGAATCAAATGGTAATCCTAATGTTGCTAATGCCTCAGGAG 
SEQ3010 GCTGAGA-ATTAGCTTCTGTAGATTGAG AA — CTTGATTTTGGGGCTTCATTAGATG 

SEQ3001 CTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAG 
SEQ3002 CTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAG 
SEQ3003 CTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAG 

SEQ3004 CTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGA

SEQ3005 CTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAG
SEQ3007 CTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAG
SEQ3008 CTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAG
SEQ3009 CTTCAGGACTTTTCCAAACGATGCCAGGTTGGGGTTCAACAGCTACAGTTCAGGATCAAG
SEQ3010 CCTGAGAAGGTTTT GGAGCCTGTTTTACATCTTCTACTTTTGATTTAGATGTCGC 

SEQ3001 TAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTAC
SEQ3002 TAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTAC — 

SEQ3003 TAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTAC 

SEQ3004 

SEQ3005 TAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTAC — 

SEQ3007 TAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTAC — 

SEQ3008 TAATTCAGCTATTAAAGCTT 

SEQ3009 TAATTCAGCTATTAAAGCTTATCGTGCTCAAGGTTTATCAGCTTGGGGTTA 

SEQ3010 TTAGTCA-TTTTTGATTTTTTGGCTACGCGAACTTTATCTGCTTTTGACAAAGA 

>SEQ ID NO 3050:25_1169NT frame: 1

KSSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEVPKSSSQSTEAN
SQQQVTASEEAAVEQAVVTENTPATSQAQQTYAVTETTYKPAQHQTSGQVLSNGNTAGAV
GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SAIKAYRAQGLSAWGY

>SEQ ID NO 3051:25_18RS21 frame: 1

KSSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SAIKAYRAQGLSAWGY

>SEQ ID NO 3052:25_2603 frame: 1

KSSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SAIKAYRAQGLSAWGY

>SEQ ID NO 3053:25_090 frame: 3

AKKSKMIKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEANSQQQVTASEEAAVEQAVV
TENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAIGSAAAAQMAAATGVPQST
WEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQ

>SEQ ID NO 3054:25_A909 frame: 1

KATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEANSQQQVTASEEAAVEQAVVTENTPAT
SQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAIGSAAAAQMAAATGVPQSTWEHIIAR
ESNGNPNVANASGASGLFQTMPGWGSTATVQNQVNSAIKAYRAQGLS

>SEQ ID NO 3055:25_CJB110 frame: 3

SLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEANSQQQVTAS
EEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAIGSAAAAQM
AAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVNSAIKAYRA
QGLSAWGY

>SEQ ID NO 3056:25_COH1 frame: 1

KSSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SQQQVTASEEAAVEQAVVTENTPATSQAQQTYAVTETTYKPAQHQTSGQVLSNGNTAGAV
GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SAIKAYRAQGLSAWGY

>SEQ ID NO 3057:25_H36B frame: 1

KSSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SAIKA

>SEQ ID NO 3058:25_M732 frame: 1

KSSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SQQQVTASEEAAVEQAVVTENTPATSQAQQTYAVTETTYKPAQHQTSGQVLSNGNTAGAV
GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SAIKAYRAQGLSAWG

>SEQ ID NO 3059:25_M781 frame: 4

SLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEANSQQQVTAS
EEAAVEQAVVTENTPATSQAQQTYAVTETTYKPAQHQTSGQVLSNGNTAGAVGSAAAAQM
AAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVNSAIKAYRA
QGLSAWGY

SEQ3050 SSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEVPKSSSQSTEAN
SEQ3051 SSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3052 SSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3053                  AKKSKMIKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3054                         KATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3055        SLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3056 SSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3057 SSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3058 SSQVTTESLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3059        SLSKADKVRVAKKSKMTKATSKSKVEDVKQAPKPSQASNEAPKSSSQSTEAN
SEQ3050 SQQQVTASEEAAVEQAVVTENTPATSQAQQTYAVTETTYKPAQHQTSGQVLSNGNTAGAV
SEQ3051 SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
SEQ3052 SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
SEQ3053 SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
SEQ3054 SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
SEQ3055 SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
SEQ3056 SQQQVTASEEAAVEQAVVTENTPATSQAQQTYAVTETTYKPAQHQTSGQVLSNGNTAGAV
SEQ3057 SQQQVTASEEAAVEQAVVTENTPATSQAQQAYAVTETTYRPAQHQTSGQVLSNGNTAGAI
SEQ3058 SQQQVTASEEAAVEQAVVTENTPATSQAQQTYAVTETTYKPAQHQTSGQVLSNGNTAGAV
SEQ3059 SQQQVTASEEAAVEQAVVTENTPATSQAQQTYAVTETTYKPAQHQTSGQVLSNGNTAGAV

SEQ3050 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SEQ3051 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SEQ3052 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SEQ3053 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQ
SEQ3054 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQNQVN
SEQ3055 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SEQ3056 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SEQ3057 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SEQ3058 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN
SEQ3059 GSAAAAQMAAATGVPQSTWEHIIARESNGNPNVANASGASGLFQTMPGWGSTATVQDQVN

SEQ3050 AIKAYRAQGLSAWGY
SEQ3051 AIKAYRAQGLSAWGY
SEQ3052 AIKAYRAQGLSAWGY
SEQ3053
SEQ3054 AIKAYRAQGLS
SEQ3055 AIKAYRAQGLSAWGY
SEQ3056 AIKAYRAQGLSAWGY
SEQ3057 AIKA
SEQ3058 AIKAYRAQGLSAWG-
SEQ3059 AIKAYRAQGLSAWGY
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Table 31: Comparative Sequences relating to SAG2148 
(LysM domain protein) 




SEQ ID NO. 3101: SAG2148 FROM THE 1169NT1 GBS TYPE V STRAIN

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAAGAGTTAGTG 
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAGCTAGTCAA 
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCTGCA 
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG 
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGCTTCTCGTTAC
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT 

SEQ ID NO. 3102: SAG2148 FROM THE 18RS21 GBS TYPE II STRAIN 

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAAGAGTTAGTG 
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAGCTAGTCAA 
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCA
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGTTTCTCGTTAC
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT

SEQ ID NO. 3103: SAG2148 FROM THE 2603 V/R GBS TYPE V STRAIN 

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAAGAGTTAGTG
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAGCTAGTCAA
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCA
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG 
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGTTTCTCGTTAC 
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT 

SEQ ID NO. 3104: SAG2148 FROM THE 090 GBS TYPE Ia STRAIN

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAAGAGTTAGTG 
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTAAAGCTAGTCAA 
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCA 
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG 
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGTTTCTCGTTAC 
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT 

SEQ ID NO. 3105: SAG2148 FROM THE A909 GBS TYPE Ia STRAIN

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAAGAGTTAGTG
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAGCTAGTCAA
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCA
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG 
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGCTTCTCGTTAC
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT

SEQ ID NO. 3106: SAG2148 FROM THE CJB110 GBS NONTYPEABLE STRAIN 

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAAGAGTTAGTG
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTAAAGCTAGTCAA 
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCA 
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG 
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGTTTCTCGTTAC
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT 

SEQ ID NO. 3107: SAG2148 FROM THE COH1 GBS TYPE III STRAIN

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAATAGTTAGTG
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAGCTAGTCAA
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCTGCA
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGCTTCTCGTTAC
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT

SEQ ID NO. 3108: SAG2148 FROM THE H36b GBS TYPE Ib STRAIN
(REVERSE COMPLEMENT) 

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAAGAGTTAGTG 
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAGCTAGTCAA 
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCA
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGCTTCTCGTTAC 
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT 

SEQ ID NO. 3109: SAG2148 FROM THE JM9130013 GBS TYPE VIII STRAIN
(REVERSE COMPLEMENT)

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAAGAGTTAGTG 
TCTCTCAATAGTATCAGTAACGCTGACGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAACTAGTCAA 
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCA 
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG 
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGCTTCTCGTTAC 
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT 

SEQ ID NO. 3110: SAG2148 FROM THE M732 GBS TYPE III STRAIN 

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAATAGTTAGTG 
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAGCTAGTCAA 
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCTGCA 
AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG 
TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGCTTCTCGTTAC 
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT 

SEQ ID NO. 3111: SAG2148 FROM THE M781 GBS TYPE III STRAIN 
(REVERSE COMPLEMENT) 

GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACTACGGTACAATAGTTAGTG
TCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGTGATGTTTTAAAATTGGATAATTCTACAGCTAGTCAA
GCAGAAGCAAAATCTCAACCAACAATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCTGCA

AAAGAAGAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGAAGATATCAACTG

TCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAAGTAGCGGACAATTATGTGGCTTCTCGTTAC
GGATCTTGGTCGGCAGCGCTATCATTTTGGAATAGTAACGGCTGGTAT

SEQ3101 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3102 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3103 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3104 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3105 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3106 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3107 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3108 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3109 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3110 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT
SEQ3111 GCATCTTATACCGTGAAATCAGGTGATACCTTATCAGCTATTGCTAAAAATCATAAAACT

SEQ3101 ACGGTACAAGAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3102 ACGGTACAAGAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3103 ACGGTACAAGAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3104 ACGGTACAAGAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3105 ACGGTACAAGAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3106 ACGGTACAAGAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3107 ACGGTACAATAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3108 ACGGTACAAGAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3109 ACGGTACAAGAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGACGTCATCAGTATAGGT
SEQ3110 ACGGTACAATAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3111 ACGGTACAATAGTTAGTGTCTCTCAATAGTATCAGTAACGCTGATGTCATCAGTATAGGT
SEQ3101 GATGTTTTAAAATTGGATAATTCTACAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3102 GATGTTTTAAAATTGGATAATTCTACAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3103 GATGTTTTAAAATTGGATAATTCTACAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3104 GATGTTTTAAAATTGGATAATTCTAAAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3105 GATGTTTTAAAATTGGATAATTCTACAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3106 GATGTTTTAAAATTGGATAATTCTAAAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3107 GATGTTTTAAAATTGGATAATTCTACAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3108 GATGTTTTAAAATTGGATAATTCTACAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3109 GATGTTTTAAAATTGGATAATTCTACAACTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3110 GATGTTTTAAAATTGGATAATTCTACAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA
SEQ3111 GATGTTTTAAAATTGGATAATTCTACAGCTAGTCAAGCAGAAGCAAAATCTCAACCAACA

SEQ3101 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCTGCAAAAGAA
SEQ3102 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCAAAAGAA
SEQ3103 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCAAAAGAA
SEQ3104 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCAAAAGAA
SEQ3105 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCAAAAGAA
SEQ3106 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCAAAAGAA
SEQ3107 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCTGCAAAAGAA
SEQ3108 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCAAAAGAA
SEQ3109 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCCGCAAAAGAA
SEQ3110 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCTGCAAAAGAA
SEQ3111 ATTGAAAATTCAATGAATTCTTCATCAAATTTGAGTTCAAGTGATTCAGCTGCAAAAGAA

SEQ3101 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3102 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3103 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3104 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3105 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3106 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3107 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3108 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3109 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3110 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA
SEQ3111 GAAATAGCTCGTCGTGAATCAAATGGTAGTTATACTGCACAGAATGGACAATATTATGGA

SEQ3101 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3102 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3103 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3104 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3105 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3106 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3107 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3108 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3109 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3110 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA
SEQ3111 AGATATCAACTGTCTCAATCTTACCTAAATGGCGACTTATCTCCTGAAAATCAAGAAAAA

SEQ3101 GTAGCGGACAATTATGTGGCTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3102 GTAGCGGACAATTATGTGGTTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3103 GTAGCGGACAATTATGTGGTTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3104 GTAGCGGACAATTATGTGGTTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3105 GTAGCGGACAATTATGTGGCTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3106 GTAGCGGACAATTATGTGGTTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3107 GTAGCGGACAATTATGTGGCTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3108 GTAGCGGACAATTATGTGGCTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3109 GTAGCGGACAATTATGTGGCTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3110 GTAGCGGACAATTATGTGGCTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG
SEQ3111 GTAGCGGACAATTATGTGGCTTCTCGTTACGGATCTTGGTCGGCAGCGCTATCATTTTGG

SEQ3101 AATAGTAACGGCTGGTAT 

SEQ3102 AATAGTAACGGCTGGTAT 

SEQ3103 AATAGTAACGGCTGGTAT 

SEQ3104 AATAGTAACGGCTGGTAT 

SEQ3105 AATAGTAACGGCTGGTAT 

SEQ3106 AATAGTAACGGCTGGTAT 

SEQ3107 AATAGTAACGGCTGGTAT 

SEQ3108 AATAGTAACGGCTGGTAT 

SEQ3109 AATAGTAACGGCTGGTAT

SEQ3110 AATAGTAACGGCTGGTAT 

SEQ3111 AATAGTAACGGCTGGTAT 

>SEQ ID NO 3150:15_1169NT frame: 1

ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVASRYGSWSAALSFWNSNGWY

>SEQ ID NO 3151:15_18RS21 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVVSRYGSWSAALSFWNSNGWY

>SEQ ID NO 3152:15_2603 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVVSRYGSWSAALSFWNSNGWY

>SEQ ID NO 3153:15_090 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSKASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVVSRYGSWSAALSFWNSNGWY

>SEQ ID NO 3154:15_A909 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVASRYGSWSAALSFWNSNGWY

>SEQ ID NO 3155:15_CJB110 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSKASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVVSRYGSWSAALSFWNSNGWY

>SEQ ID NO 3156:15_COH1 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQ.LVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVASRYGSWSAALSFWNSNGWY

>SEQ ID NO 3157:15_H36B frame: 1

ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVASRYGSWSAALSFWNSNGWY

>SEQ ID NO 3158:15_JM9130013 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTTSQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVASRYGSWSAALSFWNSNGWY

>SEQ ID NO 3159:15_M732 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQ.LVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVASRYGSWSAALSFWNSNGWY

>SEQ ID NO 3160:15_M781 frame: 1

ASYTVKSGDTLSAIAKNHKTTVQ.LVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT
IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK
VADNYVASRYGSWSAALSFWNSNGWY
Table 31: Comparative Sequences relating to SAG2148 
(LysM domain protein) 

SEQ3150 ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT 
SEQ3151 ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT 
SEQ3152 ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT 
SEQ3153 ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSKASQAEAKSQPT 
SEQ3154 ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT 
SEQ3155 ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSKASQAEAKSQPT 
SEQ3156 ASYTVKSGDTLSAIAKNHKTTVQ-LVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT 
SEQ3157 ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT 
SEQ3158 ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTTSQAEAKSQPT 
SEQ3159 ASYTVKSGDTLSAIAKNHKTTVQ-LVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT 
SEQ3160 ASYTVKSGDTLSAIAKNHKTTVQ-LVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT 

SEQ3150 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3151 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3152 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3153 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3154 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3155 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3156 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3157 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3158 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3159 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 
SEQ3160 IENSMNSSSNLSSSDSAAKEEIARRESNGSYTAQNGQYYGRYQLSQSYLNGDLSPENQEK 

SEQ3150 VADNYVASRYGSWSAALSFWNSNGWY 
SEQ3151 VADNYWSRYGSWSAALSFWNSNGWY 
SEQ3152 VADNYWSRYGSWSAALSFWNSNGWY 
SEQ3153 VADNYWSRYGSWSAALSFWNSNGWY 
SEQ3154 VADNYVASRYGSWSAALSFWNSNGWY 
SEQ3155 VADNYWSRYGSWSAALSFWNSNGWY 
SEQ3156 VADNYVASRYGSWSAALSFWNSNGWY 
SEQ3157 VADNYVASRYGSWSAALSFWNSNGWY 
SEQ3158 VADNYVASRYGSWSAALSFWNSNGWY 
SEQ3159 VADNYVASRYGSWSAALSFWNSNGWY 
SEQ3160 VADNYVASRYGSWSAALSFWNSNGWY 
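
By way of illustration only, a comparison of the kind shown in Table 31 can be sketched programmatically. The following Python sketch is an assumption-laden example, not part of the disclosure: the function name variable_columns and the selection of four rows are hypothetical choices made for the example. It takes the first 60-column block of four rows of the alignment and reports the columns at which the isolates are not fully conserved.

```python
# Illustrative sketch only: the helper name and the selection of rows are
# assumptions made for this example, not part of the original disclosure.

# First 60-column block of four rows of the Table 31 alignment ("-" marks a gap).
ALIGNMENT = {
    "SEQ3150": "ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT",
    "SEQ3153": "ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSKASQAEAKSQPT",
    "SEQ3156": "ASYTVKSGDTLSAIAKNHKTTVQ-LVSLNSISNADVISIGDVLKLDNSTASQAEAKSQPT",
    "SEQ3158": "ASYTVKSGDTLSAIAKNHKTTVQELVSLNSISNADVISIGDVLKLDNSTTSQAEAKSQPT",
}


def variable_columns(alignment):
    """Map each 1-based column that is not fully conserved to its residues."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    variable = {}
    for index in range(lengths.pop()):
        # Collect the residue each sequence carries at this column.
        column = {name: seq[index] for name, seq in alignment.items()}
        if len(set(column.values())) > 1:
            variable[index + 1] = column
    return variable


if __name__ == "__main__":
    for position, residues in variable_columns(ALIGNMENT).items():
        print(position, residues)
```

Columns absent from the result are conserved across the chosen rows; the same function may be applied to any set of equal-length aligned rows.
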
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos. 

ORF Ref No. | SAGxxxx Ref No. | aa | Annotation 
ORF00003 | SAG0017 | 447 | PcsB protein 
ORF00004 | SAG0018 | 322 | ribose-phosphate pyrophosphokinase 
ORF00005 | SAG0019 | 391 | aminotransferase, class I 
ORF00006 | SAG0020 | 253 | recombination protein O 
ORF00008 | SAG0021 | 283 | protease, putative 
ORF00009 | SAG0022 | 330 | fatty acid/phospholipid synthesis protein PlsX 
ORF00010 | SAG0023 | 79 | acyl carrier protein 
ORF00011 | SAG0024 | 234 | phosphoribosylaminoimidazole-succinocarboxamide synthase 
ORF00012 | SAG0025 | 1241 | phosphoribosylformylglycinamidine synthase, putative 
ORF00013 | SAG0026 | 484 | amidophosphoribosyltransferase 
ORF00014 | SAG0027 | 340 | phosphoribosylformylglycinamidine cyclo-ligase 
ORF00015 | SAG0028 | 182 | phosphoribosylglycinamide formyltransferase 
ORF00016 | SAG0029 | 250 | acetyltransferase, GNAT family 
ORF00017 | SAG0030 | 515 | phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase 
ORF00018 | SAG0031 | 283 | peptidase, M23/M37 family 
ORF00020 | SAG0032 | 434 | group B streptococcal surface immunogenic protein 
ORF00021 | SAG0033 | 232 | N-acetylmannosamine-6-P epimerase, putative 
ORF00022 | SAG0034 | 438 | sugar ABC transporter, sugar-binding protein 
ORF00023 | SAG0035 | 295 | sugar ABC transporter, permease protein 
ORF00024 | SAG0036 | 276 | sugar ABC transporter, permease protein 
ORF00025 | SAG0037 | 147 | conserved hypothetical protein 
ORF00026 | SAG0038 | 220 | conserved hypothetical protein 
ORF00027 | SAG0039 | 305 | N-acetylneuraminate lyase, putative 
ORF00028 | SAG0040 | | ROK family protein 
ORF00029 | SAG0041 | 325 | acetyl xylan esterase, putative 
ORF00030 | SAG0042 | 267 | phosphosugar-binding transcriptional regulator, RpiR family, putative 
ORF00031 | SAG0043 | 421 | phosphoribosylamine-glycine ligase 
ORF00032 | SAG0044 | 162 | phosphoribosylaminoimidazole carboxylase, catalytic subunit 
ORF00033 | SAG0045 | 363 | phosphoribosylaminoimidazole carboxylase, ATPase subunit 
ORF00035 | SAG0046 | 463 | hypothetical protein 
ORF00036 | SAG0047 | 432 | adenylosuccinate lyase 
ORF00037 | SAG0048 | 303 | transcriptional regulator, Cro/CI family 
ORF00038 | SAG0049 | 332 | Holliday junction DNA helicase RuvB 
ORF00039 | SAG0050 | 145 | phosphotyrosine protein phosphatase, low molecular weight 
ORF00040 | SAG0051 | 126 | MORN motif family protein 
ORF00041 | SAG0052 | 592 | membrane protein, putative 
ORF00042 | SAG0053 | 880 | aldehyde-alcohol dehydrogenase 
ORF00043 | SAG0054 | 338 | alcohol dehydrogenase, propanol-preferring 
ORF00044 | SAG0055 | 496 | threonine synthase 
ORF00045 | SAG0056 | 412 | MATE efflux family protein 
ORF00046 | SAG0057 | 102 | ribosomal protein S10 
ORF00047 | SAG0058 | 208 | ribosomal protein L3 
ORF00048 | SAG0059 | 207 | ribosomal protein L4 
ORF00049 | SAG0060 | | ribosomal protein L23 
ORF00050 | SAG0061 | 277 | ribosomal protein L2 
ORF00052 | SAG0062 | 92 | ribosomal protein S19 
ORF00054 | SAG0063 | 114 | ribosomal protein L22 
ORF00055 | SAG0064 | 217 | ribosomal protein S3 
ORF00056 | SAG0065 | 137 | ribosomal protein L16 
ORF00058 | SAG0066 | 68 | ribosomal protein L29 
ORF00059 | SAG0067 | 86 | ribosomal protein S17 
ORF00060 | SAG0068 | 122 | ribosomal protein L14 
ORF00061 | SAG0069 | 101 | ribosomal protein L24 
ORF00063 | SAG0070 | 180 | ribosomal protein L5 
ORF00064 | SAG0071 | 61 | ribosomal protein S14, putative 
ORF00065 | SAG0072 | 132 | ribosomal protein S8 
ORF00066 | SAG0073 | 178 | ribosomal protein L6 
ORF00068 | SAG0074 | 118 | ribosomal protein L18 
ORF00069 | SAG0075 | 164 | ribosomal protein S5 
ORF00070 | SAG0076 | 59 | ribosomal protein L30 
ORF00071 | SAG0077 | 146 | ribosomal protein L15 
ORF00072 | SAG0078 | 434 | preprotein translocase, SecY subunit 
ORF00073 | SAG0079 | 212 | adenylate kinase 
ORF00074 | SAG0080 | 72 | translation initiation factor IF-1 
ORF00075 | SAG0081 | 38 | ribosomal protein L36 
ORF00077 | SAG0082 | 121 | ribosomal protein S13 
ORF00078 | SAG0083 | 118 | ribosomal protein S11 
ORF00080 | SAG0084 | 312 | DNA-directed RNA polymerase, alpha subunit 
ORF00081 | SAG0085 | 128 | ribosomal protein L17 
ORF00087 | SAG0086 | 97 | hypothetical protein 
ORF00088 | SAG0087 | 59 | hypothetical protein 
ORF00089 | SAG0088 | 56 | hypothetical protein 
ORF00090 | SAG0089 | 183 | conserved hypothetical protein 
ORF00091 | SAG0090 | 139 | conserved hypothetical protein 
ORF00093 | SAG0091 | 144 | transcriptional regulator ComX1, putative 
ORF00094 | SAG0092 | 230 | phosphoglycerate mutase family protein 
ORF00095 | SAG0093 | 250 | D-alanyl-D-alanine carboxypeptidase family protein 
ORF00096 | SAG0094 | 191 | N-acetylmuramoyl-L-alanine amidase, family 4 protein 
ORF00097 | SAG0095 | 344 | heat-inducible transcription repressor HrcA 
ORF00098 | SAG0096 | 190 | heat shock protein GrpE 
ORF00099 | SAG0097 | 609 | dnaK protein 
ORF00100 | SAG0098 | 379 | dnaJ protein 
ORF00101 | SAG0099 | 415 | transcriptional regulator, GntR family 
ORF00102 | SAG0100 | 258 | tRNA pseudouridine synthase A 
ORF00103 | SAG0101 | 252 | phosphomethylpyrimidine kinase, putative 
ORF00104 | SAG0102 | 154 | conserved hypothetical protein 
ORF00105 | SAG0103 | 189 | conserved hypothetical protein 
ORF00106 | SAG0104 | 280 | conserved hypothetical protein 
ORF00107 | SAG0105 | 427 | trigger factor 
ORF00108 | SAG0106 | 191 | DNA-directed RNA polymerase, delta subunit, putative 
ORF00109 | SAG0107 | 534 | CTP synthase 
ORF00110 | SAG0108 | 308 | conserved hypothetical protein 
ORF00111 | SAG0109 | 148 | deoxyuridine 5'-triphosphate nucleotidohydrolase 
ORF00112 | SAG0110 | 454 | DNA repair protein RadA 
ORF00113 | SAG0111 | 165 | carbonic anhydrase-related protein 
ORF00115 | SAG0112 | 439 | pyridine nucleotide-disulphide oxidoreductase family protein 
ORF00116 | SAG0113 | 484 | glutamyl-tRNA synthetase 
ORF00117 | SAG0114 | 322 | ribose ABC transporter, periplasmic D-ribose-binding protein 
ORF00118 | SAG0115 | 310 | ribose ABC transporter, permease protein 
ORF00119 | SAG0116 | 492 | ribose ABC transporter, ATP-binding protein 
ORF00120 | SAG0117 | 132 | ribose ABC transporter protein RbsD 
ORF00121 | SAG0118 | 303 | ribokinase 
ORF00122 | SAG0119 | 328 | ribose operon repressor RbsR 
ORF00123 | SAG0120 | 32 | hypothetical protein 
ORF00124 | SAG0121 | 362 | permease, putative 
ORF00125 | SAG0122 | 228 | ABC transporter, ATP-binding protein 
ORF00126 | SAG0123 | 223 | DNA-binding response regulator 
ORF00128 | SAG0124 | 356 | sensor histidine kinase 
ORF00129 | SAG0125 | 396 | argininosuccinate synthase 
ORF00130 | SAG0126 | 462 | argininosuccinate lyase 
ORF00131 | SAG0127 | 293 | fructose-bisphosphate aldolase 
ORF00132 | SAG0128 | 305 | L-2-hydroxyisocaproate dehydrogenase 
ORF00133 | SAG0129 | 62 | ribosomal protein L28 
ORF00134 | SAG0130 | 121 | conserved hypothetical protein 
ORF00135 | SAG0131 | 543 | DAK2 domain protein 
ORF00136 | SAG0132 | 294 | SPFH domain/Band 7 family protein 
ORF00137 | SAG0133 | 38 | conserved hypothetical protein 
ORF00138 | SAG0134 | 96 | hypothetical protein 
ORF00141 | SAG0135 | 246 | amino acid ABC transporter, ATP-binding protein 
ORF00142 | SAG0136 | 516 | amino acid ABC transporter, amino acid-binding protein/permease protein 
ORF00143 | SAG0137 | 627 | conserved hypothetical protein 
ORF00145 | SAG0138 | 279 | undecaprenol kinase, putative 
ORF00146 | SAG0139 | 251 | negative regulator of competence MecA, putative 
ORF00148 | SAG0140 | 386 | glycosyl transferase, group 4 family protein 
ORF00149 | SAG0141 | 256 | ABC transporter, ATP-binding protein 
ORF00150 | SAG0142 | 420 | conserved hypothetical protein 
ORF00151 | SAG0143 | 410 | selenocysteine lyase 
ORF00152 | SAG0144 | 147 | NifU family protein 
ORF00153 | SAG0145 | 472 | conserved hypothetical protein 
ORF00154 | SAG0146 | 395 | penicillin-binding protein 4, putative 
ORF00155 | SAG0147 | 411 | D-alanyl-D-alanine carboxypeptidase 
ORF00156 | SAG0148 | 551 | oligopeptide ABC transporter, substrate binding protein, putative 
ORF00157 | SAG0149 | 304 | oligopeptide ABC transporter, permease protein 
ORF00158 | SAG0150 | 343 | oligopeptide ABC transporter, permease protein 
ORF00160 | SAG0151 | 348 | oligopeptide ABC transporter, ATP-binding protein 
ORF00161 | SAG0152 | 310 | oligopeptide ABC transporter, ATP-binding protein 
ORF00166 | SAG0153 | 283 | 4-diphosphocytidyl-2C-methyl-D-erythritol kinase 
ORF00167 | SAG0154 | 147 | adc operon repressor AdcR 
ORF00168 | SAG0155 | 236 | zinc ABC transporter, ATP-binding protein 
ORF00169 | SAG0156 | 270 | zinc ABC transporter, permease protein 
ORF00172 | SAG0158 | 419 | tyrosyl-tRNA synthetase 
ORF00173 | SAG0159 | 765 | penicillin-binding protein 1B, putative 
ORF00174 | SAG0160 | 1191 | DNA-directed RNA polymerase, beta subunit 
ORF00176 | SAG0161 | 1216 | DNA-directed RNA polymerase beta' subunit 
ORF00178 | SAG0162 | 121 | conserved hypothetical protein 
ORF00179 | SAG0163 | 323 | competence protein CglA 
ORF00180 | SAG0164 | 282 | competence protein CglB 
ORF00181 | SAG0165 | 151 | conserved hypothetical protein 
ORF00182 | SAG0166 | 123 | conserved domain protein 
ORF00183 | SAG0167 | 324 | conserved hypothetical protein 
ORF00184 | SAG0168 | 397 | acetate kinase 
ORF00186 | SAG0169 | 68 | transcriptional regulator, Cro/CI family 
ORF00187 | SAG0170 | 45 | hypothetical protein 
ORF00188 | SAG0171 | 151 | hypothetical protein 
ORF00189 | SAG0172 | 221 | protease, putative 
ORF00190 | SAG0173 | 256 | pyrroline-5-carboxylate reductase 
ORF00191 | SAG0174 | 355 | glutamyl-aminopeptidase 
ORF00192 | SAG0175 | 79 | hypothetical protein 
ORF00193 | SAG0176 | 94 | conserved hypothetical protein 
ORF00194 | SAG0177 | 107 | thioredoxin family protein 
ORF00195 | SAG0178 | 208 | tRNA binding domain protein 
ORF00196 | SAG0179 | 238 | conserved hypothetical protein 
ORF00198 | SAG0180 | 131 | single-strand binding protein 
ORF00199 | SAG0181 | 214 | hydrolase, haloacid dehalogenase-like family 
ORF00200 | SAG0182 | 581 | sensor histidine kinase, putative 
ORF00201 | SAG0183 | 246 | response regulator 
ORF00203 | SAG0184 | 151 | conserved hypothetical protein 
ORF00204 | SAG0185 | 242 | membrane protein, putative 
ORF00205 | SAG0186 | 36 | hypothetical protein 
ORF00206 | SAG0187 | 542 | oligopeptide ABC transporter, oligopeptide-binding protein 
ORF00207 | SAG0188 | 325 | oligopeptide ABC transporter, permease protein 
ORF00208 | SAG0189 | 273 | oligopeptide ABC transporter, permease protein 
ORF00209 | SAG0190 | 267 | peptide ABC transporter, ATP-binding protein 
ORF00210 | SAG0191 | 208 | peptide ABC transporter, ATP-binding protein 
ORF00211 | SAG0192 | 676 | PTS system, IIABC components 
ORF00212 | SAG0193 | 541 | alpha amylase family protein 
ORF00214 | SAG0194 | 639 | transcriptional antiterminator, BglG family 
ORF00216 | SAG0195 | 377 | IS1548, transposase 
ORF00217 | SAG0196 | 66 | conserved domain protein 
ORF00218 | SAG0197 | 94 | PTS system, IIB component, putative 
ORF00219 | SAG0198 | 451 | PTS system, IIC component, putative 
ORF00220 | SAG0199 | 285 | transketolase, N-terminal subunit 
ORF00221 | SAG0200 | 309 | transketolase, C-terminal subunit 
ORF00223 | SAG0201 | 419 | oxidoreductase, putative 
ORF00224 | SAG0202 | 89 | ribosomal protein S15 
ORF00225 | SAG0203 | 709 | polyribonucleotide nucleotidyltransferase 
ORF00226 | SAG0204 | 250 | conserved hypothetical protein 
ORF00227 | SAG0205 | 194 | serine O-acetyltransferase 
ORF00228 | SAG0206 | 60 | hypothetical protein 
ORF00229 | SAG0207 | 447 | cysteinyl-tRNA synthetase 
ORF00230 | SAG0208 | 128 | conserved hypothetical protein 
ORF00231 | SAG0209 | 251 | RNA methyltransferase, TrmH family, group 3 
ORF00232 | SAG0210 | 172 | conserved hypothetical protein 
ORF00233 | SAG0211 | 286 | DegV family protein 
ORF00234 | SAG0212 | 32 | hypothetical protein 
ORF00235 | SAG0213 | 39 | hypothetical protein 
ORF00236 | SAG0214 | 148 | ribosomal protein L13 
ORF00237 | SAG0215 | 130 | ribosomal protein S9 
ORF00238 | SAG0216 | 33 | hypothetical protein 
ORF00239 | SAG0217 | 384 | site-specific recombinase, phage integrase family 
ORF00240 | SAG0218 | 158 | transcriptional regulator, Cro/CI family 
ORF00241 | SAG0219 | 101 | hypothetical protein 
ORF00242 | SAG0220 | 92 | conserved hypothetical protein 
ORF00243 | SAG0221 | 76 | hypothetical protein 
ORF00244 | SAG0222 | 108 | conserved domain protein 
ORF00245 | SAG0223 | 209 | conserved hypothetical protein, fusion 
ORF00246 | SAG0224 | 332 | replication initiation protein, putative 
ORF00247 | SAG0225 | 144 | hypothetical protein 
ORF00248 | SAG0226 | 418 | recombination protein 
ORF00249 | SAG0227 | 156 | hypothetical protein 
ORF00250 | SAG0228 | 111 | conserved hypothetical protein 
ORF00251 | SAG0229 | 95 | conserved hypothetical protein 
ORF00252 | SAG0230 | 96 | conserved hypothetical protein 
ORF00253 | SAG0231 | 135 | hypothetical protein 
ORF00254 | SAG0232 | 186 | hypothetical protein 
ORF00255 | SAG0233 | 226 | hypothetical protein 
ORF00256 | SAG0234 | 128 | hypothetical protein 
ORF00257 | SAG0235 | 93 | hypothetical protein 
ORF00258 | SAG0236 | 32 | hypothetical protein 
ORF00259 | SAG0237 | 34 | hypothetical protein 
ORF00260 | SAG0238 | 41 | hypothetical protein 
ORF00261 | SAG0239 | 286 | transcriptional regulator, MutR family 
ORF00262 | SAG0240 | 393 | transporter, putative 
ORF00263 | SAG0241 | 213 | amino acid ABC transporter, permease protein 
ORF00264 | SAG0242 | 308 | amino acid ABC transporter, amino acid-binding protein 
ORF00265 | SAG0243 | 211 | amino acid ABC transporter, permease protein 
ORF00266 | SAG0244 | 381 | amino acid ABC transporter, ATP-binding protein 
ORF00272 | SAG0245 | 152 | hypothetical protein 
ORF00273 | SAG0246 | 268 | hypothetical protein 
ORF00274 | SAG0247 | 116 | hypothetical protein 
ORF00275 | SAG0248 | 90 | hypothetical protein 
ORF00276 | SAG0249 | 116 | hypothetical protein 
ORF00278 | SAG0250 | 193 | hypothetical protein 
ORF00279 | SAG0251 | 72 | transcriptional regulator, Cro/CI family 
ORF00280 | SAG0252 | 186 | acetyltransferase, GNAT family 
ORF00281 | SAG0253 | 192 | acetyltransferase, GNAT family 
ORF00282 | SAG0254 | 226 | acetyltransferase, GNAT family 
ORF00283 | SAG0255 | 315 | conserved hypothetical protein 
ORF00284 | SAG0256 | 163 | RNA polymerase sigma factor, ECF subfamily 
ORF00285 | SAG0257 | 53 | hypothetical protein 
ORF00287 | SAG0258 | 202 | transcriptional regulator, TetR family 
ORF00288 | SAG0259 | 365 | ABC transporter efflux protein, DrrB family, putative 
ORF00289 | SAG0260 | 238 | ABC transporter, ATP-binding protein 
ORF00290 | SAG0261 | 129 | IS1381, transposase OrfB 
ORF00291 | SAG0262 | 127 | IS1381, transposase OrfA 
ORF00292 | SAG0263 | 171 | hypothetical protein 
ORF00293 | SAG0264 | 103 | conserved hypothetical protein 
ORF00294 | SAG0265 | 235 | conserved hypothetical protein 
ORF00295 | SAG0266 | 382 | N-acetylglucosamine-6-phosphate deacetylase 
ORF00296 | SAG0267 | 180 | conserved hypothetical protein 
ORF00297 | SAG0268 | 304 | glycyl-tRNA synthetase, alpha subunit 
ORF00298 | SAG0269 | 213 | acyl carrier protein phosphodiesterase, putative 
ORF00299 | SAG0270 | 679 | glycyl-tRNA synthetase, beta subunit 
ORF00300 | SAG0271 | 85 | conserved hypothetical protein 
ORF00301 | SAG0272 | 67 | membrane protein, putative 
ORF00302 | SAG0273 | 502 | glycerol kinase 
ORF00303 | SAG0274 | 609 | alpha-glycerophosphate oxidase 
ORF00304 | SAG0275 | 232 | glycerol uptake facilitator protein 
ORF00305 | SAG0276 | 445 | NADH oxidase, putative 
ORF00306 | SAG0277 | 476 | conserved hypothetical protein 
ORF00307 | SAG0278 | 661 | transketolase 
ORF00308 | SAG0279 | 101 | conserved hypothetical protein 
ORF00309 | SAG0280 | 244 | ABC transporter, ATP-binding protein 
ORF00310 | SAG0281 | 534 | membrane protein, putative 
ORF00313 | SAG0282 | 461 | PTS system, IIBC components 
ORF00314 | SAG0283 | 267 | glutamate 5-kinase 
ORF00315 | SAG0284 | 417 | gamma-glutamyl phosphate reductase 
ORF00316 | SAG0285 | 298 | conserved hypothetical protein TIGR00006 
ORF00317 | SAG0286 | 108 | cell division protein FtsL, putative 
ORF00318 | SAG0287 | 752 | penicillin-binding protein 2X 
ORF00319 | SAG0288 | 336 | phospho-N-acetylmuramoyl-pentapeptide-transferase 
ORF00320 | SAG0289 | 447 | ATP-dependent RNA helicase, DEAD/DEAH box family 
ORF00321 | SAG0290 | 270 | ABC transporter, substrate-binding protein 
ORF00322 | SAG0291 | 267 | amino acid ABC transporter, permease protein 
ORF00323 | SAG0292 | 247 | amino acid ABC transporter, ATP-binding protein 
ORF00324 | SAG0293 | 74 | conserved hypothetical protein 
ORF00325 | SAG0294 | 304 | thioredoxin reductase 
ORF00326 | SAG0295 | 486 | conserved hypothetical protein 
ORF00327 | SAG0296 | 273 | NAD synthetase 
ORF00328 | SAG0297 | 444 | aminopeptidase C 
ORF00329 | SAG0298 | 750 | penicillin-binding protein 1A 
ORF00330 | SAG0299 | 199 | recombination protein U 
ORF00331 | SAG0300 | 172 | conserved hypothetical protein 
ORF00332 | SAG0301 | 40 | hypothetical protein 
ORF00333 | SAG0302 | 110 | conserved hypothetical protein 
ORF00335 | SAG0303 | 384 | conserved hypothetical protein 
ORF00336 | SAG0304 | 487 | conserved hypothetical protein 
ORF00337 | SAG0305 | 160 | autoinducer-2 production protein LuxS 
ORF00338 | SAG0306 | 535 | KH domain protein 
ORF00340 | SAG0307 | 33 | hypothetical protein 
ORF00341 | SAG0308 | | ABC transporter, ATP-binding protein, FRAMESHIFT 
ORF00343 | SAG0309 | 246 | ABC transporter, permease protein, putative 
ORF00344 | SAG0310 | 361 | conserved hypothetical protein 
ORF00345 | SAG0311 | | DNA-binding response regulator, POINT MUTATION 
ORF00347 | SAG0312 | 234 | conserved hypothetical protein 
ORF00348 | SAG0313 | 209 | guanylate kinase 
ORF00349 | SAG0314 | 104 | DNA-directed RNA polymerase, omega subunit, putative 
ORF00350 | SAG0315 | 796 | primosomal protein N' 
ORF00351 | SAG0316 | 311 | methionyl-tRNA formyltransferase 
ORF00352 | SAG0317 | 440 | Sun protein 
ORF00353 | SAG0318 | 245 | serine/threonine phosphatase, putative 
ORF00354 | SAG0319 | 651 | serine/threonine protein kinase 
ORF00355 | SAG0320 | 231 | conserved hypothetical protein 
ORF00356 | SAG0321 | 339 | sensor histidine kinase, putative 
ORF00358 | SAG0322 | 213 | DNA-binding response regulator 
ORF00359 | SAG0323 | 466 | hydrolase, haloacid dehalogenase family/peptidyl-prolyl cis-trans isomerase, cyclophilin type 
ORF00360 | SAG0324 | 124 | general stress protein, putative 
ORF00361 | SAG0325 | 258 | pyruvate formate-lyase-activating enzyme 
ORF00362 | SAG0326 | 261 | transcriptional regulator, DeoR family 
ORF00363 | SAG0327 | 327 | transcriptional regulator, putative 
ORF00364 | SAG0328 | 107 | PTS system, cellobiose-specific IIA component 
ORF00366 | SAG0329 | 106 | PTS system, cellobiose-specific IIB component 
ORF00367 | SAG0330 | 433 | PTS system, cellobiose-specific IIC component 
ORF00368 | SAG0331 | 818 | formate acetyltransferase 
ORF00369 | SAG0332 | 222 | transaldolase family protein 
ORF00371 | SAG0333 | 362 | glycerol dehydrogenase 
ORF00372 | SAG0334 | 308 | cysteine synthase A 
ORF00373 | SAG0335 | 214 | conserved hypothetical protein TIGR00257 
ORF00374 | SAG0336 | 429 | helicase, putative 
ORF00375 | SAG0337 | 221 | competence protein F, putative 
ORF00376 | SAG0338 | 184 | ribosomal subunit interface protein 
ORF00382 | SAG0339 | 450 | aspartate kinase family protein 
ORF00383 | SAG0340 | 216 | hydrolase, haloacid dehalogenase-like family 
ORF00384 | SAG0341 | 49 | hypothetical protein 
ORF00385 | SAG0342 | 263 | enoyl-CoA hydratase/isomerase family protein 
ORF00386 | SAG0343 | 144 | transcriptional regulator, MarR family 
ORF00387 | SAG0344 | 323 | 3-oxoacyl-(acyl-carrier-protein) synthase III 
ORF00388 | SAG0345 | 74 | acyl carrier protein 
ORF00390 | SAG0346 | 319 | enoyl-(acyl-carrier-protein) reductase II 
ORF00391 | SAG0347 | 308 | malonyl CoA-acyl carrier protein transacylase 
ORF00392 | SAG0348 | 244 | 3-oxoacyl-[acyl-carrier protein] reductase 
ORF00393 | SAG0349 | 410 | 3-oxoacyl-(acyl-carrier-protein) synthase II 
ORF00394 | SAG0350 | 166 | acetyl-CoA carboxylase, biotin carboxyl carrier protein 
ORF00395 | SAG0351 | 140 | (3R)-hydroxymyristoyl-(acyl-carrier-protein) dehydratase 
ORF00396 | SAG0352 | 456 | acetyl-CoA carboxylase, biotin carboxylase 
ORF00397 | SAG0353 | 291 | acetyl-CoA carboxylase, carboxyl transferase, beta subunit 
ORF00398 | SAG0354 | 257 | acetyl-CoA carboxylase, carboxyl transferase, alpha subunit 
ORF00399 | SAG0355 | 210 | conserved hypothetical protein 
ORF00400 | SAG0356 | 425 | seryl-tRNA synthetase 
ORF00402 | SAG0357 | 330 | hypothetical protein 
ORF00403 | SAG0358 | 120 | conserved hypothetical protein 
ORF00404 | SAG0359 | 303 | PTS system, mannose-specific IID component 
ORF00405 | SAG0360 | 270 | PTS system, mannose-specific IIC component 
ORF00406 | SAG0361 | 336 | PTS system, mannose-specific IIAB components 
ORF00407 | SAG0362 | 270 | hydrolase, haloacid dehalogenase-like family 
ORF00408 | SAG0363 | 194 | hypothetical protein 
ORF00409 | SAG0364 | 203 | membrane protein, putative 
ORF00410 | SAG0365 | 473 | xanthine/uracil permease family protein 
ORF00411 | SAG0366 | 169 | conserved hypothetical protein TIGR00150, putative 
ORF00412 | SAG0367 | 186 | acetyltransferase, GNAT family 
ORF00413 | SAG0368 | 435 | transcriptional regulator, putative 
ORF00414 | SAG0369 | 115 | conserved hypothetical protein 
ORF00415 | SAG0370 | 139 | HIT family protein 
ORF00416 | SAG0371 | 167 | hypothetical protein 
ORF00417 | SAG0372 | 85 | hypothetical protein 
ORF00419 | SAG0373 | 241 | ABC transporter, ATP-binding protein 
ORF00421 | SAG0374 | 344 | ABC transporter, permease protein 
ORF00422 | SAG0375 | 266 | conserved hypothetical protein 
ORF00423 | SAG0376 | 211 | conserved hypothetical protein TIGR00091 
ORF00424 | SAG0377 | 127 | conserved hypothetical protein, POINT MUTATION 
ORF00425 | SAG0378 | 379 | N utilization substance protein A 
ORF00426 | SAG0379 | 98 | conserved hypothetical protein 
ORF00427 | SAG0380 | 100 | ribosomal protein L7A family 
ORF00428 | SAG0381 | 927 | translation initiation factor IF-2 
ORF00429 | SAG0382 | 122 | ribosome-binding factor A 
ORF00430 | SAG0383 | 334 | conserved hypothetical protein 
ORF00431 | SAG0384 | 138 | transcriptional repressor CopY 
ORF00432 | SAG0385 | 744 | copper-transporter ATPase CopA 
ORF00433 | SAG0386 | 68 | copper-transporter protein CopZ 
ORF00434 | SAG0387 | 204 | conserved hypothetical protein 
ORF00435 | SAG0388 | 270 | hydrolase, haloacid dehalogenase-like family 
ORF00436 | SAG0389 | 880 | DNA polymerase I 
ORF00437 | SAG0390 | 146 | CoA binding domain protein 
ORF00438 | SAG0391 | 159 | transcriptional regulator, Fur family 
ORF00439 | SAG0392 | 521 | cell wall surface anchor family protein 
ORF00440 | SAG0393 | 228 | DNA-binding response regulator 
ORF00441 | SAG0394 | 345 | sensor histidine kinase 
ORF00442 | SAG0395 | 246 | conserved hypothetical protein 
ORF00443 | SAG0396 | 380 | queuine tRNA-ribosyltransferase 
ORF00444 | SAG0397 | 102 | conserved hypothetical protein 
ORF00445 | SAG0398 | 179 | BioY family protein 
ORF00446 | SAG0399 | 258 | AtsA/ElaC family protein 
ORF00447 | SAG0400 | 158 | cytidine/deoxycytidylate deaminase family protein 
ORF00448 | SAG0401 | 44 | hypothetical protein 
ORF00449 | SAG0402 | 449 | glucose-6-phosphate isomerase 
ORF00450 | SAG0403 | 176 | 5-formyltetrahydrofolate cyclo-ligase family protein 
ORF00451 | SAG0404 | 225 | rhomboid family protein 
ORF00452 | SAG0405 | 347 | lipoprotein 
ORF00453 | SAG0406 | 299 | UTP-glucose-1-phosphate uridylyltransferase 
ORF00454 | SAG0407 | 338 | glycerol-3-phosphate dehydrogenase (NAD(P)+) 
ORF00455 | SAG0408 | 109 | ribonuclease P protein component 
ORF00456 | SAG0409 | 271 | SpoIIIJ family protein 
ORF00458 | SAG0410 | 273 | R3H domain protein 
ORF00463 | SAG0411 | 177 | conserved hypothetical protein 
ORF00464 | SAG0412 | 258 | RecX protein 
ORF00465 | SAG0413 | 451 | RNA methyltransferase, TrmA family 
ORF00466 | SAG0414 | 153 | conserved hypothetical protein 
ORF00467 | SAG0415 | 142 | acetyltransferase, GNAT family 
ORF00468 | SAG0416 | 1233 | protease, putative 
ORF00469 | SAG0417 | 302 | glycosyl transferase, group [illegible] family protein 
ORF00470 | SAG0418 | 336 | ribonucleoside-diphosphate reductase 2, beta subunit 
ORF00471 | SAG0419 | 137 | nrdI protein 
ORF00472 | SAG0420 | 721 | ribonucleoside-diphosphate reductase 2, alpha subunit 
ORF00473 | SAG0421 | 1055 | conserved hypothetical protein 
ORF00474 | SAG0422 | 129 | conserved hypothetical protein 
ORF00475 | SAG0423 | 132 | conserved domain protein 
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF00476 1 


SAG0424 $ 


94 


hypothetical protein _J 


1 ORF00478 


SAG0425 


105


carboxymuconolactone decarboxylase family protein


\ ORF00479 [ 


SAG0426 


131 I < 


conserved hypothetical protein J 


{ ORF00480 1 


SAG0427 


129


transcriptional regulator, MerR family


ORF00482 


SAG0428 


345


alcohol dehydrogenase, zinc-containing


| ORF00483 


SAG0429 


284


oxidoreductase, aldo/keto reductase family


1 ORF00484 


SAG0430 


287


cation efflux system protein


ORF00485 f 


SAG0431 


174 I 




1 ORF00486 \ 


SAG0432 


397 I 


transcriptional regulator, AraC family l 


J ORF00487 j 


SAG0433 i 


1389 I 


surface protein Rib I 


ORF00488


SAG0434 I 


61 


transposase, IS256 family, truncation l 


ORF00489 I 


SAG0435 


fl7 1 

»7 | 


L/INAV'Ugu HdyC3~ll lUUlrlUlt? JJI ULCII I **| j^uiauiro * 


j ORF00490 1 


SAG0436 ; 


62 1 


hypothetical protein


ORF00491 I 


SAG0437 I 


123 1 


hypothetical protein


ORF00493 I 


SAG0438 j 


145


bacteriophage L54a, integrase, truncation


I ORF00495 I 


SAG0439 




conserved hypothetical protein, FRAMESHIFT


I ORF00496 


SAG0440 


84 J 


conserved hypothetical protein


ORF00497 J 


SAG0441 


103 | 


conserved domain protein


| ORF00499 | 


SAG0442 


189 


acetyltransferase, GNAT family


I ORF00500 


SAG0443 


194 


acetyltransferase, GNAT family




SAG0444 


188 


vunscrvcu iiypuuioiiwai $ 


I oRPnn^n? l 


SAG0445 I 


883 


valyl-tRNA synthetase


1 HRFfin^OS ! 


SAG0446 


j 319 


UAIUUI vmU VlQs V| WIWllwui'ivw i ■ ■ Jf y 


i DRpnn^n4 


SAG0447 


| 287 


IllClM IHSOIMIII U Cll lopwi wi | iuiiiiij ^^^« 


1 vrxrUUuuO 


SAG0448 


I- 391 


transposase, IS256 family


1 UrxrMUJUl 


SAG0449 


354 


conserved hypothetical protein | 


ORF00508


SAG0450 


I 330 


aspartate--ammonia ligase


1 ORF00510 


SAG0451 


I 149 


bacteriocin transport accessory protein, putative


I ORF00511 


SAG0452 


S 179 


type II DNA modification methyltransferase, putative 1 


ORF00512 


| SAG0453 


) 98 


1 hypothetical protein 1 


} ORF00513 


SAG0454


\ 161 


1 phosphopantetheine adenylyltransferase j 


I ORF00515 


SAG0455 


|_ 357 


1 conserved hypothetical protein j 


ORF00518 


| SAG0456 




1 conserved hypothetical protein, degenerate 1 


ORF00519 


SAG0457 


| 192 


| conserved hypothetical protein J 


I ORF00520 


SAG0458


368


conserved hypothetical protein TIGR00048 


\ ORF00521 


I SAG0459 


I 171 


1 VanZF domain protein ^ | 


ORF00522 


! SAG0460 


581 


ABC transporter, ATP-binding/permease protein


ORF00523 


SAG0461 


579 


1 ABC transporter, ATP-binding/permease protein j 


ORF00524 


SAG0462 


188 


1 anthranilate synthase component II | 


ORF00525 


SAG0463 


179 


1 bioY family protein ^ 1 


ORF00526 


SAG0464 


i 330 


1 biotin synthetase 1 


J ORF00527 


| SAG0465 


I 164 


1 hypothetical protein 1 


ORF00528 


SAG0466 


S 371 


I thiolase 1 


ORF00531 


SAG0467 


| 409 


1 AMP-binding enzyme domain protein | 


| ORF00532 


SAG0468 


210 


endonuclease III


ORF00533 


SAG0469 


131 


1 type IV prepilin peptidase-related protein J 


ORF00534 


SAG0470 


I 69 


conserved hypothetical protein j 


ORF00535 


SAG0471 


322 


1 glucokinase 1 


| ORF00536 


i: SAG0472 


j 126 


1 rhodanese domain protein 1 


ORF00537 


SAG0473 


613 


elongation factor Tu family protein 


ORF00538 


I SAG0474 


\ 81 


conserved hypothetical protein "j 


ORF00540 


SAG0475 


451 


UDP-N-acetylmuramoylalanine-D-glutamate ligase


ORF00541


SAG0476


358


UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase


ORF00542 


SAG0477 ; 


378


cell division protein DivIB, putative


ORF00544 I 


SAG0478 


429


cell division protein FtsA


ORF00545 


SAG0479 


426


cell division protein FtsZ


ORF00546 | 


SAG0480 


224


ylmE protein, putative


ORF00547 


SAG0481 


201


ylmF protein


ORF00548 I 


SAG0482 


84 | ' 


YGGT family protein I 


ORF00549 


SAG0483 i 


262


ylmH protein


ORF00550 | 


SAG0484 


256


cell division protein DivIVA, putative


ORF00552 I 


SAG0485 i 




isoleucyl-tRNA synthetase


ORF00553


SAG0486 | 


100 I 


conserved hypothetical protein ] 


ORF00554 J 


SAG0487 


151 j 


MutT/nudix family protein m 1 


ORF00555 


SAG0488 j 


753 J 


ATP-dependent Clp protease, ATP-binding subunit


UKr UUOOD | 




34 I 


hypothetical protein 


ADCAACC7 li 


SAG0490


76 I 


conserved hypothetical protein J 


QKrQuOOo I 


SAG0491


230 


amino acid ABC transporter, permease protein


ORrQuooSJ | 


SAG0492


244 | 


amino acid ABC transporter, ATP-binding protein


ORrOOooO 1 




564 

I 


phosphoglucomutase/phosphomannomutase family protein


DRPnn^R9 1 


SAG0494 1 


284 | 


methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase


ORF00563 


SAG0495 


278 


conserved hypothetical protein 1 


ORF00564 


SAG0496 


446 


exodeoxyribonuclease VII, large subunit


ORF00565 


SAG0497 


i 71 ) 


exodeoxyribonuclease VII, small subunit


ORF00566 


SAG0498 


| 290 j 


geranyltranstransferase, putative 


ORF00567 


j SAG0499 


j 275 


hemolysin A | 


ORF00568 


| SAG0500 


| 157 


arginine repressor ArgR, putative j 


ORF00570 


i SAG0501 


| 552 


DNA repair protein RecN 1 


ORF00571 


I SAG0502 


j 278 


DegV family protein 1 


ORF00572 


SAG0503 


| 279 


lipase/acylhydrolase, putative


ORF00573 


; SAG0504 


I 200 


conserved hypothetical protein 1 


ORF00574 


SAG0505 


! 91 


DNA-binding protein HU j 


ORF00575 


SAG0506 


I 65 


hypothetical protein I 


ORF00576 


SAG0507


S 310 


dihydroorotate dehydrogenase A j 


ORF00577 


: SAG0508 


i 411 


1 beta-lactam resistance factor _l 


ORF00578 


\ SAG0509 


I 403 


1 beta-lactam resistance factor J 


ORF00579 


£ SAG0510 


| 406 


1 murM protein, putative 1 


ORF00580 


i SAG0511 


270 


hydrolase, haloacid dehalogenase-like family


ORF00581 


SAG0512 


I 438 


HD domain protein


ORF00582 


l SAG0513 


I 128 


1 conserved hypothetical protein | 


ORF00583 


SAG0514 


894 


cation-transporting ATPase, E1-E2 family i 


ORF00584 


; SAG0515 


286 


I conserved hypothetical protein I 


ORF00585 


f SAG0516 


643 


fructose-1,6-bisphosphatase, putative


ORF00586 


SAG0517 


374 


iron-sulfur cluster-binding protein, putative "^J 


ORF00587 


SAG0518 




| peptide chain release factor 2, FRAMESHIFT j 


ORF00588 


SAG0519 


230 


cell division ABC transporter, ATP-binding protein FtsE 


ORF00589 


SAG0520 


309 


I cell division ABC transporter, permease protein FtsX 


ORF00590 


i SAG0521 


| 236 


I carboxymethylenebutenolidase-related protein 


ORF00591 


SAG0522 


| 232 


| metallo-beta-lactamase superfamily protein | 


ORF00592 


SAG0523 


254


oxidoreductase, short chain dehydrogenase/reductase family


J ORF00593 


SAG0524 I 


835


DNA polymerase III, epsilon subunit/ATP-dependent helicase DinG


S ORF00595 


SAG0525 


397 « 


aspartate aminotransferase j 


l ORF00596 


SAG0526 i 


448 J 


asparaginyl-tRNA synthetase 1 


ORF00597 


SAG0527 | 


185 i 


conserved hypothetical protein 1 


I ORF00598 


SAG0528 


327 


inosine-uridine preferring nucleoside hydrolase


i ORF00599 


SAG0529 


38 


hypothetical protein j 


ORF00600 


SAG0530 ! 


137 


OsmC/Ohr family protein 1 


ORF00601 


SAG0531 


296 


conserved hypothetical protein 1 


| ORF00602 


SAG0532 


324 


conserved hypothetical protein | 


3 ORF00603 


SAG0533 j 


303 


Uncharacterized BCR, COG1481 | 


ORF00604 


SAG0534 | 


465 


dipeptidase I 


I ORF00605 


SAG0535 } 


506 


zinc ABC transporter, zinc-binding adhesion liprotein | 

, I 


l ORF00606 


SAG0536 | 


86 


ribosomal protein L31 j 


ORF00607 


SAG0537 || 


311 


DHH family protein I 


| ORF00608 


SAG0538 | 


340 


adenosine deaminase, putative 


I ORF00609 


SAG0539 ; 


147 


flavodoxin _J 


I ORF00610 


SAG0540 | 


91 


chorismate mutase, putative 


| ORF00611 


SAG0541 i 


398 


voltage-gated chloride channel family protein I 


I ORF00612 


SAG0542 | 


127 


IS1381, transposase OrfA


| ORF00613 


SAG0543 


129 


IS1381, transposase OrfB


I ORF00614 


SAG0544 


115 


ribosomal protein L19


| ORF00615 


SAG0545 


359 


site-specific recombinase, phage integrase family I 

I 


I ORF00617 


SAG0546 


67 


conserved domain protein I 


ORF00618 


SAG0547 


185 


hypothetical protein 


I ORF00619 


SAG0548 


i 265 


repressor protein, putative = I 


ORF00620 


SAG0549 


I 47 


hypothetical protein 1 


ORF00621 


SAG0550 


I 74 


conserved hypothetical protein 


ORF00622 


SAG0551 


i 52 


conserved hypothetical protein J 


ORF00623 


SAG0552 


! 62 


hypothetical protein 1 


ORF00624 


SAG0553 


268 


hypothetical protein 


ORF00626 


SAG0554 


63 


transcriptional regulator, Cro/CI family ~ j 


! ORF00627 


SAG0555 


S 249 


antirepressor, putative 


ORF00628 


SAG0556 


47 


hypothetical protein 1 


I ORF00630 


| SAG0557 


I 76 


hypothetical protein j 


i ORF00632 


SAG0558 


74 


hypothetical protein j 


! ORF00633 


SAG0559 


I 286 


conserved hypothetical protein 1 


ORF00634 


SAG0560 


I 77 


conserved hypothetical protein 


I ORF00635 


SAG0561 


46 


hypothetical protein 


ORF00636 


SAG0562 


84 


hypothetical protein j 


j ORF00637 


SAG0563 


53 


hypothetical protein 1 


I ORF00638 


SAG0564 


I 160 


conserved hypothetical protein | 


| ORF00639 




I OOA 


conserved domain protein


ORF00640 


SAG0566 


I 138 


single-strand binding protein _J 


ORF00641 


SAG0567 


439 


reverse transcriptase/maturase family protein j 


f ORF00642 


SAG0568 


j 67 


conserved hypothetical protein 


I ORF00643 


SAG0569 




conserved hypothetical protein 


ORF00644 


SAG0570 


i us 


hypothetical protein 


ORF00645 


SAG0571 


43 


hypothetical protein j 


ORF00646 


SAG0572 


138 


conserved hypothetical protein I 


I ORF00647 


SAG0573 


I 54 


hypothetical protein 1 


ORF00648 1 


SAG0574 S 


89


conserved hypothetical protein


ORF00649 


SAG0575 


110


hypothetical protein


ORF00650 1 


SAG0576 


43 I 


hypothetical protein 1 


ORF00652 f 


SAG0577 | 


177


conserved hypothetical protein


ORF00653 i; 


SAG0578 


88 j < 


conserved hypothetical protein 1 


ORF00654 1 


SAG0581 


118 {< 


conserved hypothetical protein _ 1 


ORF00655 1 


SAG0582 


422 It 


conserved hypothetical protein 1 


ORF00656 


SAG0583 


406 < 


conserved hypothetical protein 1 


ORF00657 i 


SAG0584 I 


62 \i 


conserved hypothetical protein, truncation j 


ORF00658 1 


SAG0585 


471 


conserved hypothetical protein S 


ORF00659 


SAG0586 


154 


conserved hypothetical protein I 


ORF00660 i 


SAG0587 


300 


structural protein, putative j 


ORF00661 J 


SAG0588 


71 I 


conserved hypothetical protein


ORF00662 


SAG0589 I 


143 | 


conserved hypothetical protein 1 


ORF00663


SAG0590 I 


112 


conserved hypothetical protein j 


ORF00664 


SAG0591 


78 


conserved hypothetical protein I 


ORF00665 


SAG0592 I 


111 I 


conserved hypothetical protein 1 


ORF00666 


SAG0593 


185 I 


structural protein 1 


ORF00667


SAG0594


81


conserved hypothetical protein


ORF00668


SAG0595


123


conserved hypothetical protein


ORF00669


SAG0596


670


PblA, internal deletion


ORF00670


SAG0597


506


minor structural protein, putative


ORF00671
ORF00672 ( 

UKrUuOf* ] 

ORF00675 

ORF00676 

ORF00677 

ORF00678 


SAG0598 
SAG0599 !t 

SAG0600

SAG0601

SAG0602 

SAG0603 
) SAG0604 
i SAG0605 


1374 

668 ! 

109 

70 

100 
I 111 
I 239 

323 


minor structural protein, putative j 
minor structural protein, putative 1 
hypothetical protein 1 

[ hypothetical protein J 
conserved hypothetical protein 1 
holin, putative 1 
lysin, putative I 

1 conserved hypothetical protein j 


nnFflfififii 

nRF0D682 


i SAG0606 
\ SAG0607 
| SAG0608 
] SAG0609 


66 
I 56 
! 59 

I 193 


1 conserved hypothetical protein " ~ 1 
1 conserved hypothetical protein 1 
1 hypothetical protein 1 
1 site-specific recombinase, phage integrase family 1 
J j 


ORF00685 


I SAG0610 


134 


1 conserved hypothetical protein 1 


ORF00687
ORF00689 


S SAG0611 
I SAG0612 


53 


1 transposase, degenerate FRAMESHIFT J 
conserved hypothetical protein, FRAMESHIFT j 
transmembrane protein Vexpl J 


ORF00690 
ORF00691 
ORF00692 


SAG0613 
SAG0614 
j SAG0615 


425 
| 218 
I 458 


ABC transporter, ATP-binding protein Vexp2
I transmembrane protein Vexp3 I 


ORF00693 


i SAG0616 


217 


I DNA-binding response regulator VncR J 


ORF00694 


! SAG0617 


j 439 


sensor histidine kinase VncS


ORF00695 


f SAG0618 


j 195 


I transposase OrfB, IS3 family, truncation 


ORF00697 


SAG0619 


I 66 


I conserved hypothetical protein I 


ORF00698


I SAG0620 


I 62 


hypothetical protein


ORF00699 


SAG0621 


! 401 


rod shape-determining protein RodA, putative


ORF00700 


SAG0622 


l 186 


1 hydrolase, haloacid dehalogenase-like family 


ORF00701 


SAG0623 


I 650 


j DNA gyrase, B subunit 1 


ORF00702 


| SAG0624 


574 


1 septation ring formation regulator EzrA, putative | 


ORF00703 


S SAG0625 


213 


1 phosphoserine phosphatase SerB 1 


ORF00704 


SAG0626 


| 161 


1 MutT/nudix family protein 1 


ORF00705 


SAG0627 


151 


1 conserved hypothetical protein 1 


ORF00706 


SAG0628 


| 435 


I enolase 1 


ORF00707 


SAG0629


| 354 


1 conserved domain protein 1 


1 ORF00708 


SAG0630 


427


3-phosphoshikimate 1-carboxyvinyltransferase


ORF00709 1 


SAG0631 | 


170


shikimate kinase


| ORF00710 I 


SAG0632 \ 


457


psr protein


ORF00711 


SAG0633


I I « 




ORF00712 i 


SAG0634 


f V | 


hypothetical protein


1 ORF00713 1 


SAG0635 < 


1 i 


acid phosphatase precursor, class B 


ORF00714


SAG0636 


179 I i 


conserved hypothetical protein


I ORF00717 ! 


SAG0637 


1 j 
Jp 


JCil I9vl lUUvf 1 9 1 i ^^ w '* iW< i * * * ■ 9 # f ^ • 

FRAMESHIFT


ORF00718 J 


SAG0638 


109 I 


cell wall surface anchor family protein


ORF00720 


SAG0639 I 


273 I 


transposase OrfB, IS3 family 


ORF00721 I 


SAG0640 


91 J 


transposase OrfA, IS3 family


I ORF00722 | 


SAG0641 I 




Tn5252, Orf 10 protein, degenerate POINT MUTATION 


j ORF00723 J 


SAG0642 9 


59 


hypothetical protein 


ORF00725 


SAG0643 




chaperonin, 33 kDa DEGENERATE f 


ORF00726 


SAG0644 | 


402 j 


transcriptional regulator, AraC family 


| ORF00727 I 


SAG0645 | 


554 | 


cell wall surface anchor family protein, putative 


j ORF00728 I 


SAG0646 


307 I 


cell wall surface anchor family protein 


ORF00729 I 


SAG0647 j 


305 | 


sortase family protein 


| ORF00731 S 


SAG0648 


260 | 


sortase family protein 


1 ORF00732 


SAG0649 


890 I 


cell wall surface anchor family protein, putative


1 ORF00734 


SAG0650 


189 


sortase family protein, FRAMESHIFT


ORF00735 i 


SAG0651 


201 


hypothetical protein 


t ORF00737 1 


SAG0653


76 I 


conserved hypothetical protein, DEGENERATE 


ORF00738 f 


SAG0654 


34 


hypothetical protein 


ORF00740 


SAG0656 


36 


hypothetical protein _j 


ORF00741 


SAG0657 


I 89 


hypothetical protein 


| ORF00742 


SAG0658 


| 383 


lipoprotein, putative 


ORF00743 


J SAG0659 


I 330 


ABC transporter, ATP-binding protein 


ORF00744 


| SAG0660 . 


272 


| membrane protein 


j ORF00745 


{ SAG0661 


l 261 


| conserved hypothetical protein 


ORF00747 


SAG0663


I 282 


cylD protein 


i ORF00748 


» SAG0664 


I 240 


I cylG protein 


f ORF00749 


? SAG0665 


| 101 


acyl carrier protein AcpC 


ORF00750 


SAG0666 


| 158 


cylZ protein FRAMESHIFT 


ORF00751 


I SAG0667 


S 309 


cylA protein


l ORF00752 


SAG0668


292 


cylB protein 


ORF00753 


[ SAG0669 


I 667 


cylE protein 


< ORF00754 


SAG0670


I 317 


I cylF protein 


[ ORF00755 


| SAG0671 


I 731 


cylI protein


I ORF00756 


SAG0672 


403 


cylJ protein I 


I ORF00757 


SAG0673 


I 191 


j cylK protein 


I ORF00758 


\ SAG0674 


113 


I hypothetical protein 


ORF00759 


SAG0675


I 171 


I surface protein antigen-related protein | 


ORF00760 


j SAG0676 


| 885 


I serine protease, subtilase family, putative j 


i ORF00761 


i SAG0677 


1062 


hypothetical protein 


ORF00762 


SAG0678 




endopeptidase O DEGENERATE 


! ORF00766 


SAG0679 


286 


I hydrolase, alpha/beta fold family, putative 


ORF00767 


SAG0680


339 


I hypothetical protein J 


t ORF00768 


SAG0681


l 353 


| conserved domain protein 


ORF00769 


SAG0682 


I 409 


I permease, putative 


ORF00770 


SAG0683 




transmembrane protein Vexp3, putative, FRAMESHIFT


| ORF00774 


I SAG0684 


I 223 


I ABC transporter, ATP-binding protein 


ORF00775 j 


SAG0685


472


conserved hypothetical protein


ORF00776 


SAG0686 


261


DNA-entry nuclease, putative


ORF00777 


SAG0687 


212


DedA family protein, putative


ORF00778 


SAG0688 I 


218


ABC transporter, ATP-binding protein


ORF00779 i 


SAG0689 


257


membrane protein, putative


I ORF00780 


SAG0690 


272


conserved hypothetical protein


ORF00781 


SAG0691 i 


294


transcriptional regulator, LysR family


ORF00783 | 


SAG0692 


193


regulatory protein, putative


[ ORF00785 ( 


SAG0693 


377


IS1548, transposase


I ORF00786 1 


SAG0694 I 


173 | I 


regulatory protein, putative, truncation


1 ORF00787 1 


SAG0695 


330 | I 


□-lactate dehydrogenase


} ORF00788 J 


SAG0696 


516 1 : 


sodium:galactoside symporter family protein, putative


ORF00789 


SAG0697 


OH 1 | 


o.b'of/v.'^Hpnwnfiiconate kinase K 


ORF00790 


SAG0698 


CQQ 


Hfktai.nl i ir*i irnnirt 9 <t ft f 
Uclcl -yiuuui Vll IIUQDD J 


[ ORF00791 


SAG0699 l 




iranSCfipilOlidl iGjJiiialui, viur\ lauiiij ^ J 


ORF00792 


SAG0700 | 


205 1 


2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase


! ORF00793 


SAG0701 | 


Add I 


\juUmJIUIIalt3 lowiiivi aoo l 


i ORF00794 j 


SAG0702 


KkAQ 1 

o*to | 


llJal 11 twiiciK? uci ly aiuv& .,„ I 


i ORF00795 | 


SAG0703 l 


c*i\i | 


L/~ llldlll IVIUCIIC V/AIVJVH CUMVUIflv ( 


| ORF00796 i 


SAG0704 


6lU | 


hydrolase, haloacid dehalogenase-like family


ORF00797 


SAG0705 


1 




ORF00798 j 


SAG0706 


361 | 


proline dipeptidase 1 


ORF00799 I 


SAG0707 


«jO*t I 


transcriptional regulator, RegM family


! ORF00800 | 


SAG0708 


488 1 


alpha amylase family protein I 


| ORF00801 || 


SAG0709 


332 | 


1 glycosyl transferase, group 1 family protein 9 


I ORF00802 j 


\ SAG0710 


1 444 | 


| glycosyl transferase, group 1 family protein I 


ORF00803 


SAG0711 


1 647 | 


threonyl-tRNA synthetase


I ORF00804 


\ SAG0712 


i 234 i 


DNA-binding response regulator 


ORF00805 


| SAG0713 


| 339 1 


I conserved hypothetical protein j 


ORF00806 


| SAG0714 


1 1*B 


conserved hypothetical protein j 


| ORF00807 


] SAG0715 


I 216 


I amino acid ABC transporter, permease protein | 


I ORF00808 


I SAG0716 


I 231 


I amino acid ABC transporter, permease protein 1 


I ORF00809 


I SAG0717 


266 


amino acid ABC transporter, amino acid-binding protein


5 ORF00810 


I SAG0718 


I 251 


amino acid ABC transporter, ATP-binding protein


I ORF00811 


I SAG0719 


236 


1 DNA-binding response regulator 1 


I ORF00812 


I SAG0720 


449 


1 sensory box histidine kinase 1 


I ORF00813 


l SAG0721 


269 


j metallo-beta-lactamase family protein j 


ORF00814 


I SAG0722 


122 


1 conserved hypothetical protein j 


ORF00815 


| SAG0723 


236 


1 ribonuclease III 


ORF00816 


i SAG0724 


1179 


1 SMC family protein I 


| ORF00817 


SAG0725 


265 


hydrolase, haloacid dehalogenase-like family


If ORF00818 


| SAG0726 


274 


hydrolase, haloacid dehalogenase-like family


! ORF00819 


! SAG0727 


I 536 


j signal recognition particle-docking protein FtsY j 


ORF00820




i 270 


1 ABC transporter, substrate-binding protein J 


ORF00821 


| SAG0729 


J 300 


1 ABC transporter, permease protein, putative J 


ORF00822 


SAG0730 


42 


ABC transporter, ATP-binding protein 1 


ORF00823 


j SAG0731 


I 347 


J bacterial luciferase family protein _ 1 


ORF00824 


I SAG0732 


| 720 


I transcriptional accessory protein Tex, putative 


ORF00825 


* SAG0733 


142 


| conserved hypothetical protein j 


ORF00826 


I SAG0734 


I 87 


1 phage shock protein C, putative 1 


j ORF00827 


SAG0735


I 44 


1 hypothetical protein , , 1 


| ORF00828 


| SAG0736 


| 311 


1 HPr(Ser) kinase/phosphatase | 


( ORF00830 | 


SAG0737 


257


prolipoprotein diacylglyceryl transferase


1 ORF00832 1 


SAG0738 


132


conserved hypothetical protein


1 ORF00833 


SAG0739 


143 i 


conserved hypothetical protein J 


| ORF00834 


SAG0740 


91 


conserved hypothetical protein j 


| ORF00835 1 


SAG0741 


303 


peptidase, U32 family, putative


J ORF00836 


SAG0742 


428 


peptidase, U32 family _J 


ORF00837 1 


SAG0743 


70 


conserved hypothetical protein I 


ORF00838 1 


SAG0744 


265 


membrane protein, putative I 


1 ORF00839 I 


SAG0745 


446 


Mn2+/Fe2+ transporter, NRAMP family I 


i ORF00840 1 


SAG0746 


369 


riboflavin biosynthesis protein RibD I 


1 ORF00841 I 


SAG0747 


208 


riboflavin synthase, alpha subunit 1 


I ORF00842 I 


SAG0748 


397 | 


riboflavin biosynthesis protein RibA


\ ORF00843 I 


SAG0749 


156 f 


riboflavin synthase, beta subunit | 


I ORF00844 ! 


SAG0750 


496 


lysyl-tRNA synthetase


I ORF00845 I 


SAG0751 


300 | 


hydrolase, haloacid dehalogenase-like family I 


I ORF00846 


SAG0752 


213 


phosphoglycerate mutase family protein 1 


ORF00847


SAG0753 


157 | 


ebsC family protein, putative 1 


ORF00848


SAG0754 


205 


conserved domain protein 1 


ORF00850


SAG0755


282 I 


peptidase, U32 family 1 


ORF00852


SAG0756 


174 


conserved hypothetical protein 


1 nRFnnn^ i 

| UrxrUUOOO I 


SAG0757 


129 ] 


lipoprotein, putative 1 


! ORF00855 ; 


SAG0758


599


oligoendopeptidase F, putative


1 rvncnnocc 1 


SAG0759


931 


phosphoenolpyruvate carboxylase | 


Si oocrnAQC7 fe 


SAG0760


377 


IS1548, transposase 


1 OKrUUOt>y 1 


SAG0761


422 


cell division protein, FtsW/RodA/SpoVE family j 




SAG0762


398 


translation elongation factor Tu 


| UrxnUUoOo | 


SAG0763


252 


triosephosphate isomerase


1 nppnnRR^ 1 

1 UKrUUOou i 


SAG0764 


230 


I phosphoglycerate mutase I 


i ORpnnnRR 1 


SAG0765 


681 


| penicillin-binding protein 2b _J 


ORF00867


SAG0766 


198 


I recombination protein RecR I 


ORF00868


SAG0767 


348 


| D-alanine-D-alanine ligase J 


ORF00869 


SAG0768


455 


UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate-D-alanyl-D-alanyl ligase


ORF00870


| SAG0769 


406 


I oxalate:formate antiporter 


5 ORF00871 


SAG0770 


228 


I conserved hypothetical protein I 


ORF00872


SAG0771 


i 512 


I cell wall surface anchor family protein I 


ORF00873


I SAG0772 


514 


I peptide chain release factor 3 j 


ORF00874 


I SAG0773 


126 


conserved hypothetical protein | 


1 npFfinft7fi 


I SAG0774 


244 


I ABC transporter, ATP-binding protein I 


ORF00878


I SAG0775 


220 


| ABC transporter, permease protein | 


1 ORF00879 


SAG0776 


276 


lipoprotein, putative


ORF00880


SAG0777


528 


ATP-dependent RNA helicase, DEAD/DEAH box family


ORF00882


SAG0778 


88 


I conserved hypothetical protein 


} ORF00883 


I SAG0779 


254 


| conserved hypothetical protein 


\ ORF00884 


SAG0780 


246 


I acyltransferase family protein j 


1 ORF00885 


SAG0781 


217 


j competence protein CelA _J 


ORF00887 


SAG0782 


745 


DNA internalization-related competence protein ComEC/Rec2


ORF00888 


SAG0783 


269 


hydrolase, haloacid dehalogenase-like family


ORF00889 


SAG0784 


314 


sugar-binding transcriptional regulator, LacI family


I ORF00890 


j SAG0785 


330 


conserved hypothetical protein


ORF00891 


SAG0786 


242 


I conserved domain protein 


I ORF00892 


SAG0787 


| 345 


DNA polymerase III, delta subunit, putative



ORF00893 



SAG0788 



202 | superoxide dismutase, Fe-Mn



ORF00894 



SAG0789 



283 | transcriptional antiterminator LicT



ORF00895 



SAG0790 | 622 | PTS system, beta-glucosides-specific IIABC components



ORF00896 



SAG0791 



475 | 6-phospho-beta-glucosidase



ORF00898 



SAG0792 



364 | conserved hypothetical protein 



ORF00899 



SAG0793 



380 | conserved hypothetical protein TIGR00045



ORF00900 



SAG0794 



418 | permease, GntP family 



ORF00902 



SAG0795 



354 \ conserved hypothetical protein 



ORF00903 



SAG0796 



M7 j transcriptional regulator, MarR family 



ORF00904 



SAG0797 | 342 | S-adenosylmethionine:tRNA ribosyltransferase-isomerase



ORF00905 



SAG0798 



226 I membrane protein, putative 



ORF00906 



ORF00907 



SAG0799 



233 | glucosamine-6-phosphate isomerase 



SAG0800



318 | Glutathione S-transferases domain protein



ORF00908 



SAG0801 



239 | ribosomal small subunit pseudouridine synthase



ORF00909 



SAG0802 



38 | hypothetical protein 



ORF00910 



SAG0803 



383 | major facilitator family protein 



ORF00911 



SAG0804 



315 | competence protein CoiA



ORF00912 



ORF00913 



SAG0805 



601 j oligoendopeptidase B 



SAG0806 



208 | hydrolase, haloacid dehalogenase-like family



ORF00914 



SAG0807 



235 | O-methyltransferase family protein 



ORF00916 



SAG0808 



309 I protease maturation protein, putative 



ORF00918 



SAG0809 



161 I conserved hypothetical protein 



ORF00919 



SAG0810 



872 | alanyl-tRNA synthetase 



ORF00921 



SAG0811 



238 I membrane protein, putative 



ORF00922 



SAG0812 



272 I glycosyl transferase, family 8 



ORF00923 



SAG0813 



81 | hypothetical protein 



ORF00924 



ORF00925 



SAG0814 



95 \ conserved domain protein 



SAG0815 



Tl | transcriptional regulator, Cro/CI family



ORF00926 



SAG0816 



253 I conserved hypothetical protein 



ORF00927 



SAG0817 



187 l conserved hypothetical protein 



ORF00928 



ORF00929 



SAG0818 



319 I ribonucleoside-diphosphate reductase 2, beta subunit 



SAG0819



719 | ribonucleoside-diphosphate reductase 2, alpha subunit



ORF00930 



ORF00931 



SAG0820



74 | ribonucleoside-diphosphate reductase 2, NrdH-redoxin



SAG0821 | 87 | phosphocarrier protein HPr

SAG0822 | 577 | phosphoenolpyruvate-protein phosphotransferase



ORF00932 



ORF00933 



SAG0823 | 475 | glyceraldehyde-3-phosphate dehydrogenase, NADP-dependent



ORF00934 



SAG0824 



417 | polysaccharide deacetylase family protein



ORF00935 



SAG0825 | 360 | ATP-dependent RNA helicase, DEAD/DEAH box family



ORF00936 



SAG0826 



209 I uridine kinase 



ORF00937 



ORF00938 



SAG0827 



165 I conserved hypothetical protein 



SAG0828 



554 | DNA polymerase III, gamma and tau subunits



ORF00939 



SAG0829



64 | conserved hypothetical protein



ORF00940 



SAG0830 



3T1 | biotin-acetyl-CoA-carboxylase ligase



ORF00941 



ORF00942 



SAG0831 



SAG0832 



398 | S-adenosylmethionine synthetase



753 I hypothetical protein 



ORF00943 



SAG0833 



181 | hypothetical protein 



ORF00944 



SAG0834 



42 j hypothetical protein 



ORF00945 



SAG0835 



188 I conserved hypothetical protein 



ORF00946 



ORF00948 






SAG0836 



184 I conserved hypothetical protein 



SAG0837 



428 I ABC transporter, ATP-binding protein 



ORF00950 



SAG0838 



233 I hypothetical protein 



ORF00951 



SAG0839 



226 | transcriptional regulator, TenA family 



ORF00952 



SAG0840 | 265 | phosphomethylpyrimidine kinase



ORF00953 



SAG0841 



256 | hydroxyethylthiazole kinase



ORF00954 



SAG0842 | 223 | thiamine-phosphate pyrophosphorylase 



ORF00955 



SAG0843 | 419 | UDP-N-acetylglucosamine 1-carboxyvinyltransferase



ORF00956 



SAG0844 



184 | acetyltransferase, GNAT family



ORF00957 



SAG0845 



427 | CBS domain protein



ORF00958 



SAG0846 



286 | methionine aminopeptidase, type I 



ORF00959 



SAG0847 



306 | ribonuclease BN, putative



ORF00961 



ORF00962 



SAG0848 



151 | GtrA family protein



SAG0849 



169 | conserved hypothetical protein 



ORF00963 



SAG0850 



652 | DNA ligase, NAD-dependent 



ORF00964 



SAG0851 



339 l BmrU protein, putative 



ORF00966 



SAG0852 



766 | pullulanase, putative 



ORF00967 



SAG0853



622 | 1,4-alpha-glucan branching enzyme



ORF00968 



SAG0854 



379 



glucose-1-phosphate adenylyltransferase

glycogen biosynthesis protein GlgD FRAMESHIFT 



ORF00969 



SAG0855 



ORF00971 



ORF00972 



ORF00973 



ORF00974 



SAG0856 



SAG0857 



476 
66 



I glycogen synthase 



j ATP synthase F0, C subunit 



SAG0858 



238 \ ATP synthase F0. A subunit 



SAG0859 



165 l ATP synthase F0, B subunit 



SAG0860 I 1 78 I ATP synthase F1 , delta subunit ~ 
SAG0861 l 501 ATP synthase F1 , alpha subunif" 
SAG0862 l 293 j ATP synthase F1 , gamma subunit 



ORF00975 
ORF00976 
ORF00977 



ORF00978 



ORF00979 



SAG0863 



468 I ATP synthase F1 , beta subunit 



SAG0864 



137 | ATP synthase F1 , epsilon subunit 



ORF00980 



SAG0865 t 76 j conserved hypothetical protein 

SAG0866 I 423 | UDP-N-acetylglucosamine 1-carboxyvinyltransferase 



ORF00981 



ORF00982 



SAG0867 



63 I conserved hypothetical protein 



ORF00983 



ORF00984 



SAG0868 



285 | PNA-entrynuclease 



8AG0869 



346 I phenytalanyMRNA synthetase, alpha subunit 



ORF00985 



SAG0870 



1 73 I acetyltransferase, GNAT family 



ORF00986 



SAG0871 



801 | phenylalanyl-tRNA synthetase, beta subunit 



ORF00987 



SAG0872 



300 | conserved hypothetical protein 



ORF00988 



ORF00989 



SAG0873 



1077 | exonuclease RexB 



SAG0874 



1207 I exonuclease RexA 



ORF00990 



SAG0875 



305 I magnesium transporter, CorA family, putative 



ORF00991 



SAG0876 



458 I tRNA modification GTPase TrmE 



ORF00992 



SAG0877 



636 | ABC transporter, ATP-bindlng protein 



ORF00993 



SAG0878 | 322 I acetoin dehydrogenase, thymine PPi dependent, E1 
[component, alpha subunit 



ORF00994 



"SAG0879 j 332 acetoin dehydrogenase, thymine PPi dependent, E1 
Icomponent, beta subunit 



ORF00995 



SAG0880 | 462 j acetoin dehydrogenase, thymine PPi dependent, E2 
Icomponent, dihydrolipoamide acetyltransferase 



ORF00996 



SAG0881 | 585 I acetoin dehydrogenase, thymine PPi dependent E3 
Icomponent, dihydrolipoamide dehydrogenase 



ORF00997 



SAG0882 



329 | lipoate-protein ligase A 



ORF00998 



SAG0883 



"261 I cobyric acid synthase, putative 
[illegible] | SAG0884 | 447 | Mur ligase family protein
ORF01000 | SAG0885 | 283 | conserved hypothetical protein TIGR00159
[illegible] | SAG0886 | 319 | Gram-positive signal peptide, YSIRK family domain protein
ORF01002 | SAG0887 | 450 | phosphoglucomutase/phosphomannomutase family protein
ORF01003 | SAG0888 | 123 | conserved hypothetical protein
ORF01004 | SAG0889 | 126 | conserved hypothetical protein
ORF01005 | SAG0890 | 376 | oxygen-independent coproporphyrinogen III oxidase, putative
ORF01006 | SAG0891 | 245 | conserved hypothetical protein
ORF01007 | SAG0892 | 256 | hydrolase, haloacid dehalogenase-like family
ORF01008 | SAG0893 | 218 | conserved hypothetical protein
ORF01009 | SAG0894 | 1370 | conserved hypothetical protein
ORF01010 | SAG0895 | 269 | lipoyl-binding domain protein
ORF01011 | SAG0896 | 108 | oxidoreductase, putative
ORF01012 | SAG0897 | 221 | conserved hypothetical protein
ORF01013 | SAG0898 | 83 | hypothetical protein
ORF01014 | SAG0899 | 57 | hypothetical protein
ORF01015 | SAG0900 | 56 | hypothetical protein
ORF01016 | SAG0901 | 127 | hypothetical protein
ORF01018 | SAG0902 | 45 | hypothetical protein
ORF01019 | SAG0903 | 44 | hypothetical protein
ORF01021 | SAG0904 | 56 | hypothetical protein
ORF01022 | SAG0905 | 138 | nucleoside diphosphate kinase
ORF01023 | SAG0906 | 610 | GTP-binding protein LepA
ORF01024 | SAG0907 | 877 | streptococcal histidine triad family protein
ORF01025 | SAG0908 | 203 | HD domain protein
ORF01026 | SAG0909 | 154 | acetyltransferase, GNAT family
ORF01027 | SAG0910 | 144 | PilB-related protein
ORF01030 | SAG0911 | 930 | cation-transporting ATPase, E1-E2 family
ORF01031 | SAG0912 | 367 | nucleoside diphosphate kinase domain protein
ORF01032 | SAG0913 | 212 | chloramphenicol acetyltransferase
ORF01033 | SAG0914 | 203 | conserved hypothetical protein
ORF01034 | SAG0915 | 405 | Tn916, transposase
ORF01035 | SAG0916 | 67 | Tn916, excisionase
ORF01037 | SAG0918 | 76 | Tn916, hypothetical protein
ORF01038 | SAG0919 | 157 | Tn916, hypothetical protein
ORF01039 | SAG0921 | 117 | Tn916, transcriptional regulator, putative
ORF01040 | SAG0923 | 639 | Tn916, tetracycline resistance protein
ORF01041 | SAG0925 | 310 | Tn916, hypothetical protein
ORF01042 | SAG0926 | 333 | Tn916, NLP/P60 family protein
ORF01044 | SAG0927 | 725 | Tn916, hypothetical protein FRAMESHIFT
ORF01047 | SAG0928 | | Tn916, hypothetical protein FRAMESHIFT
ORF01048 | SAG0929 | 168 | Tn916, hypothetical protein
ORF01049 | SAG0930 | 165 | Tn916, hypothetical protein
ORF01050 | SAG0931 | 73 | Tn916, hypothetical protein
ORF01051 | SAG0932 | 401 | Tn916, transcriptional regulator, putative
ORF01052 | SAG0933 | 461 | Tn916, FtsK/SpoIIIE family protein
ORF01053 | SAG0934 | 128 | Tn916, hypothetical protein
ORF01054 | SAG0935 | 104 | Tn916, hypothetical protein
ORF01056 | SAG0937 | | ABC transporter, ATP-binding protein, FRAMESHIFT
ORF01057 | SAG0938 | 122 | transcriptional regulator, GntR family
ORF01058 | SAG0939 | 1034 | DNA polymerase III, alpha subunit
ORF01059 | SAG0940 | 340 | 6-phosphofructokinase
ORF01060 | SAG0941 | 500 | pyruvate kinase
ORF01061 | SAG0942 | 185 | signal peptidase I, putative
ORF01062 | SAG0943 | 47 | hypothetical protein
ORF01063 | SAG0944 | 604 | glucosamine--fructose-6-phosphate aminotransferase (isomerizing)
ORF01064 | SAG0945 | 377 | IS1548, transposase
ORF01066 | SAG0946 | 109 | phnA protein
ORF01068 | SAG0947 | 213 | amino acid ABC transporter, permease protein
ORF01069 | SAG0948 | 209 | amino acid ABC transporter, ATP-binding protein
ORF01070 | SAG0949 | 276 | amino acid ABC transporter, amino acid-binding protein
ORF01072 | SAG0950 | 82 | ribosomal protein S20
ORF01073 | SAG0951 | 306 | pantothenate kinase
ORF01074 | SAG0952 | 196 | conserved hypothetical protein
ORF01075 | SAG0953 | 129 | cytidine deaminase
ORF01076 | SAG0954 | 349 | lipoprotein
ORF01077 | SAG0955 | 511 | sugar ABC transporter, ATP-binding protein
[illegible] | SAG0956 | 353 | sugar ABC transporter, permease protein, putative
ORF01079 | SAG0957 | 318 | sugar ABC transporter, permease protein, putative
ORF01080 | SAG0958 | 456 | NADH oxidase
ORF01081 | SAG0959 | 329 | L-lactate dehydrogenase
ORF01082 | SAG0960 | 819 | DNA gyrase, A subunit
ORF01083 | SAG0961 | 247 | sortase SrtA
ORF01084 | SAG0962 | 137 | glyoxylase family protein
ORF01085 | SAG0963 | 320 | conserved hypothetical protein
ORF01086 | SAG0964 | 375 | Na+/H+ exchanger family protein
ORF01087 | SAG0965 | 127 | IS1381, transposase OrfA
ORF01088 | SAG0966 | 129 | IS1381, transposase OrfB
ORF01089 | SAG0967 | 520 | GMP synthase
ORF01090 | SAG0968 | 232 | transcriptional regulator, GntR family
ORF01091 | SAG0969 | 444 | gid protein
ORF01092 | SAG0970 | 247 | acetyltransferase, GNAT family
ORF01093 | SAG0971 | 282 | lipoprotein, putative
ORF01095 | SAG0972 | | conserved hypothetical protein, FRAMESHIFT
ORF01096 | SAG0973 | 320 | nisin-resistance protein, putative
ORF01097 | SAG0974 | 250 | ABC transporter, ATP-binding protein
ORF01098 | SAG0975 | 651 | ABC transporter, permease protein, putative
ORF01099 | SAG0976 | 222 | DNA-binding response regulator
ORF01100 | SAG0977 | 312 | sensor histidine kinase
ORF01101 | SAG0978 | 356 | site-specific recombinase, phage integrase family
ORF01102 | SAG0979 | 553 | ABC transporter, substrate binding protein, putative
ORF01103 | SAG0980 | 257 | conserved hypothetical protein
ORF01104 | SAG0981 | 228 | SatD
ORF01108 | SAG0982 | 521 | signal recognition particle protein
ORF01108 | SAG0983 | 110 | conserved hypothetical protein
ORF01109 | SAG0984 | 437 | sensor histidine kinase CiaH
ORF01110 | SAG0985 | 226 | DNA-binding response regulator CiaR
ORF01111 | SAG0986 | 849 | aminopeptidase N
ORF01112 | SAG0987 | 217 | phosphate transport system regulatory protein PhoU
ORF01113 | SAG0988 | 252 | phosphate ABC transporter, ATP-binding protein PstB, putative
ORF01114 | SAG0989 | 267 | phosphate ABC transporter, ATP-binding protein PstB, putative
ORF01115 | SAG0990 | 295 | phosphate ABC transporter, permease protein PstA, putative
ORF01116 | SAG0991 | 305 | phosphate ABC transporter, permease protein
ORF01117 | SAG0992 | [illegible] | [illegible]
ORF01118 | SAG0993 | 436 | NOL1/NOP2/sun family protein
ORF01119 | SAG0994 | 254 | inositol monophosphatase family protein
ORF01120 | SAG0995 | 93 | conserved hypothetical protein
ORF01121 | SAG0996 | 137 | conserved hypothetical protein
ORF01122 | SAG0997 | 310 | macrolide-efflux protein mreA/riboflavin biosynthesis protein RibF
ORF01123 | SAG0998 | 294 | tRNA pseudouridine synthase B
ORF01124 | SAG0999 | 143 | acetyltransferase, GNAT family
ORF01125 | SAG1000 | 423 | conserved hypothetical protein
ORF01126 | SAG1001 | 196 | conserved hypothetical protein
ORF01127 | SAG1002 | 292 | protease, putative
ORF01128 | SAG1003 | 876 | permease, putative
ORF01129 | SAG1004 | 233 | ABC transporter, ATP-binding protein
ORF01131 | SAG1005 | 706 | DNA topoisomerase I
ORF01132 | SAG1006 | 280 | DprA/SMF protein, putative DNA processing factor
ORF01133 | SAG1007 | 342 | iron-compound ABC transporter, iron-compound-binding protein
ORF01134 | SAG1008 | 253 | iron compound ABC transporter, ATP-binding protein
ORF01135 | SAG1009 | 324 | iron compound ABC transporter, permease protein
ORF01136 | SAG1010 | 320 | iron compound ABC transporter, permease protein
ORF01137 | SAG1011 | 182 | acetyltransferase, CysE/LacA/LpxA/NodL family
ORF01138 | SAG1012 | 253 | ribonuclease HII
ORF01139 | SAG1013 | 283 | GTP-binding protein
ORF01140 | SAG1014 | 190 | conserved hypothetical protein
ORF01142 | SAG1015 | 494 | carbon starvation protein CstA, putative
ORF01143 | SAG1016 | 244 | response regulator
ORF01144 | SAG1017 | 579 | sensor histidine kinase, putative
ORF01145 | SAG1018 | 40 | hypothetical protein
ORF01146 | SAG1019 | 39 | conserved hypothetical protein, FRAMESHIFT
ORF01148 | SAG1020 | 227 | hypothetical protein
ORF01149 | SAG1021 | 107 | hypothetical protein
ORF01150 | SAG1022 | 177 | hypothetical protein
ORF01151 | SAG1023 | 48 | hypothetical protein
ORF01152 | SAG1024 | 183 | hypothetical protein
ORF01153 | SAG1025 | 149 | hypothetical protein
ORF01156 | SAG1026 | | immunogenic secreted protein, DEGENERATE
ORF01157 | SAG1027 | 84 | conserved hypothetical protein
ORF01158 | SAG1028 | 196 | hypothetical protein
ORF01159 | SAG1029 | 101 | hypothetical protein
ORF01160 | SAG1030 | 304 | conserved hypothetical protein
ORF01161 | SAG1031 | 120 | extracellular protein, putative POINT MUTATION
ORF01162 | SAG1032 | 85 | conserved hypothetical protein
ORF01164 | SAG1033 | 1309 | FtsK/SpoIIIE family protein
ORF01166 | SAG1034 | 55 | hypothetical protein
ORF01167 | SAG1035 | 424 | conserved hypothetical protein
ORF01168 | SAG1036 | 80 | conserved hypothetical protein
ORF01169 | SAG1037 | 157 | hypothetical protein
ORF01172 | SAG1038 | 1003 | phage infection protein, putative
ORF01173 | SAG1039 | 96 | conserved hypothetical protein
ORF01174 | SAG1040 | 260 | conserved domain protein
ORF01175 | SAG1041 | 107 | hypothetical protein
ORF01176 | SAG1042 | 1060 | carbamoyl-phosphate synthase, large subunit
ORF01177 | SAG1043 | 358 | carbamoyl-phosphate synthase, small subunit
ORF01178 | SAG1044 | 307 | aspartate carbamoyltransferase
ORF01179 | SAG1045 | 430 | dihydroorotase, multifunctional complex type
ORF01180 | SAG1046 | 209 | orotate phosphoribosyltransferase
ORF01181 | SAG1047 | 233 | orotidine 5'-phosphate decarboxylase
ORF01182 | SAG1048 | 410 | membrane protein, putative
ORF01183 | SAG1049 | 513 | ABC transporter, ATP-binding protein
ORF01184 | SAG1050 | 112 | ribonucleotide reductase, truncation
ORF01185 | SAG1051 | 358 | aspartate-semialdehyde dehydrogenase
ORF01186 | SAG1052 | 47 | cell wall surface anchor family protein, putative
ORF01187 | SAG1053 | 30 | hypothetical protein
ORF01188 | SAG1054 | 531 | cardiolipin synthetase
ORF01189 | SAG1055 | 556 | formate-tetrahydrofolate ligase
ORF01190 | SAG1056 | 339 | lipoate-protein ligase A
ORF01191 | SAG1057 | 292 | conserved hypothetical protein
ORF01192 | SAG1058 | 272 | conserved hypothetical protein
ORF01193 | SAG1059 | 110 | glycine cleavage system H protein, putative
ORF01194 | SAG1060 | 328 | bacterial luciferase family protein
ORF01195 | SAG1061 | 399 | oxidoreductase, FMN-binding
ORF01197 | SAG1062 | 282 | lipoate-protein ligase A family protein
ORF01198 | SAG1063 | 228 | flavoprotein-related protein
ORF01199 | SAG1064 | 180 | flavoprotein family protein
ORF01200 | SAG1065 | 190 | membrane protein, putative
ORF01201 | SAG1066 | 572 | phosphoglucomutase
ORF01202 | SAG1067 | 178 | IS861, transposase OrfA
ORF01203 | SAG1068 | 277 | IS861, transposase OrfB
ORF01204 | SAG1069 | 65 | hypothetical protein
ORF01205 | SAG1070 | 577 | ABC transporter, ATP-binding/permease protein
ORF01206 | SAG1071 | 573 | ABC transporter, ATP-binding/permease protein
ORF01207 | SAG1072 | 200 | conserved hypothetical protein
ORF01208 | SAG1073 | 325 | conserved hypothetical protein
ORF01209 | SAG1074 | 418 | serine hydroxymethyltransferase
ORF01210 | SAG1075 | 183 | Sua5/YciO/YrdC/YwlC family protein
ORF01211 | SAG1076 | 276 | modification methylase, HemK family
ORF01212 | SAG1077 | 359 | peptide chain release factor 1
ORF01213 | SAG1078 | 189 | thymidine kinase
ORF01214 | SAG1079 | 60 | 4-oxalocrotonate tautomerase
ORF01215 | SAG1080 | 47 | hypothetical protein
ORF01216 | SAG1081 | 312 | ApbE family protein
ORF01217 | SAG1082 | 200 | conserved hypothetical protein
ORF01218 | SAG1083 | 411 | conserved hypothetical protein
ORF01219 | SAG1084 | 262 | formate/nitrite transporter family protein
ORF01220 | SAG1085 | 424 | xanthine permease
ORF01221 | SAG1086 | 193 | xanthine phosphoribosyltransferase
ORF01222 | SAG1087 | 327 | guanosine monophosphate reductase
ORF01223 | SAG1088 | 446 | drug resistance transporter, EmrB/QacA family, putative
ORF01224 | SAG1089 | 230 | conserved hypothetical protein
ORF01225 | SAG1090 | 666 | potassium uptake protein, putative
[illegible] | [illegible] | [illegible] | [illegible] family, FRAMESHIFT
ORF01227 | SAG1092 | 330 | phosphate acetyltransferase
ORF01228 | SAG1093 | 294 | ribosomal large subunit pseudouridine synthase, RluD subfamily
ORF01229 | SAG1094 | 278 | conserved hypothetical protein
ORF01230 | SAG1095 | 223 | GTP pyrophosphokinase family protein
ORF01231 | SAG1096 | 190 | conserved hypothetical protein
ORF01232 | SAG1097 | 324 | ribose-phosphate pyrophosphokinase
ORF01233 | SAG1098 | 371 | cysteine desulphurase
ORF01234 | SAG1099 | 115 | conserved hypothetical protein
ORF01235 | SAG1100 | 210 | DNA-binding protein
ORF01236 | SAG1101 | 226 | DNA repair protein RadC
ORF01237 | SAG1102 | 377 | membrane protein, putative
ORF01238 | SAG1103 | 478 | 6-phospho-beta-glucosidase
ORF01239 | SAG1104 | 204 | platelet activating factor, putative
ORF01240 | SAG1105 | 273 | hydrolase, haloacid dehalogenase-like family
ORF01241 | SAG1106 | 309 | transcriptional regulator, AraC family, putative
ORF01242 | SAG1107 | 510 | voltage-gated chloride channel family protein
ORF01243 | SAG1108 | 357 | spermidine/putrescine ABC transporter, spermidine/putrescine-binding protein
ORF01244 | SAG1109 | 258 | spermidine/putrescine ABC transporter, permease protein
ORF01245 | SAG1110 | 264 | spermidine/putrescine ABC transporter, permease protein
ORF01246 | SAG1111 | 384 | spermidine/putrescine ABC transporter, ATP-binding protein
ORF01247 | SAG1112 | 300 | UDP-N-acetylenolpyruvoylglucosamine reductase
ORF01248 | SAG1113 | 162 | 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase
ORF01249 | SAG1114 | 120 | dihydroneopterin aldolase
ORF01250 | SAG1115 | 267 | dihydropteroate synthase
ORF01251 | SAG1116 | 187 | GTP cyclohydrolase I
ORF01252 | SAG1117 | 420 | folylpolyglutamate synthase
ORF01253 | SAG1118 | 295 | rarD protein
ORF01254 | SAG1119 | 288 | homoserine kinase
ORF01255 | SAG1120 | 427 | homoserine dehydrogenase
ORF01256 | SAG1121 | 295 | polysaccharide deacetylase family protein
ORF01257 | SAG1122 | 515 | transporter, BCCT family protein
ORF01258 | SAG1123 | 34 | hypothetical protein
ORF01259 | SAG1124 | 458 | aldehyde dehydrogenase family protein
ORF01260 | SAG1125 | 335 | membrane protein
ORF01261 | SAG1126 | 228 | conserved hypothetical protein
ORF01262 | SAG1127 | 113 | conserved hypothetical protein, FRAMESHIFT
ORF01263 | [illegible] | 187 | hypothetical protein
ORF01264 | SAG1128 | 65 | transcriptional regulator, Cro/CI family
ORF01265 | SAG1129 | 36 | hypothetical protein
ORF01266 | SAG1130 | 49 | hypothetical protein
ORF01268 | SAG1131 | 164 | thiol peroxidase
ORF01269 | SAG1132 | 219 | conserved hypothetical protein
ORF01272 | SAG1133 | 254 | conserved hypothetical protein
ORF01273 | SAG1134 | 213 | transcriptional regulator, GntR family/potassium uptake protein, TrkA family
ORF01274 | SAG1135 | 183 | gls24 protein, putative
ORF01275 | SAG1136 | | conserved hypothetical protein FRAMESHIFT
ORF01276 | SAG1137 | 180 | gls24 protein, putative
ORF01277 | SAG1138 | 64 | conserved hypothetical protein
ORF01279 | SAG1139 | 193 | conserved hypothetical protein
ORF01280 | SAG1140 | 82 | conserved hypothetical protein
ORF01281 | SAG1141 | 112 | conserved hypothetical protein
ORF01282 | SAG1142 | 759 | ATP-dependent DNA helicase PcrA
ORF01283 | SAG1143 | 100 | conserved hypothetical protein, FRAMESHIFT
ORF01284 | SAG1144 | 441 | uracil permease
ORF01285 | SAG1145 | 448 | sodium:alanine symporter family protein
ORF01286 | SAG1146 | 411 | cation efflux family protein
ORF01287 | SAG1147 | 130 | conserved hypothetical protein
ORF01288 | SAG1148 | 231 | membrane protein, putative
ORF01289 | SAG1149 | 207 | conserved hypothetical protein
ORF01290 | SAG1150 | 400 | ribosomal protein S1
ORF01291 | SAG1151 | 76 | conserved hypothetical protein
ORF01292 | SAG1152 | 340 | branched-chain amino acid aminotransferase
ORF01294 | SAG1153 | 819 | DNA topoisomerase IV, A subunit
ORF01295 | SAG1154 | 653 | DNA topoisomerase IV, B subunit
ORF01296 | SAG1155 | 207 | conserved hypothetical protein TIGR00023
ORF01297 | SAG1156 | 217 | uracil-DNA glycosylase
ORF01298 | SAG1157 | 161 | conserved hypothetical protein
ORF01299 | SAG1158 | 413 | CMP-N-acetylneuraminic acid synthetase NeuA
ORF01300 | SAG1159 | 209 | neuD protein
ORF01301 | SAG1160 | 384 | UDP-N-acetylglucosamine-2-epimerase NeuC
ORF01302 | SAG1161 | 341 | N-acetyl neuramic acid synthetase NeuB
ORF01303 | SAG1162 | 466 | cpsL protein
ORF01304 | SAG1163 | 318 | cpsVK protein
ORF01305 | SAG1164 | 321 | cpsVJ protein
ORF01306 | SAG1165 | 327 | cpsVO protein
ORF01307 | SAG1166 | 295 | cpsVN protein
ORF01308 | SAG1167 | 241 | cpsVM protein
ORF01309 | SAG1168 | 364 | cpsVH protein
ORF01310 | SAG1169 | 163 | CpsVG
ORF01311 | SAG1170 | 149 | CpsF
ORF01312 | SAG1171 | 462 | CpsE
ORF01313 | SAG1172 | 229 | CpsD protein
ORF01314 | SAG1173 | 230 | cpsC protein
ORF01315 | SAG1174 | 243 | capsular polysaccharide biosynthesis protein CpsB
ORF01316 | SAG1175 | 485 | capsular polysaccharide biosynthesis protein CpsA
ORF01317 | SAG1176 | 290 | capsular polysaccharide synthesis operon
ORF01318 | SAG1177 | 255 | cpslaS protein
ORF01319 | SAG1178 | 236 | purine nucleoside phosphorylase
ORF01320 | SAG1179 | 418 | voltage-gated chloride channel family protein, putative
ORF01321 | SAG1180 | 269 | purine nucleoside phosphorylase
ORF01322 | SAG1181 | 135 | arsenate reductase
ORF01323 | SAG1182 | 403 | phosphopentomutase
ORF01324 | SAG1183 | 223 | ribose 5-phosphate isomerase
ORF01326 | SAG1184 | 236 | conserved hypothetical protein
ORF01327 | SAG1185 | 262 | tributyrin esterase
ORF01328 | SAG1186 | 553 | metallo-beta-lactamase superfamily protein
ORF01329 | SAG1187 | 253 | ABC transporter, ATP-binding protein
ORF01330 | SAG1188 | 287 | ABC transporter, permease protein
ORF01331 | SAG1189 | 334 | conserved hypothetical protein
ORF01332 | SAG1190 | 551 | adherence and virulence protein A
ORF01333 | SAG1191 | 239 | alpha-acetolactate decarboxylase
ORF01334 | SAG1192 | 560 | acetolactate synthase, catabolic
ORF01335 | SAG1193 | 408 | TPR domain protein
ORF01336 | SAG1194 | 396 | membrane protein
ORF01337 | SAG1195 | 153 | [illegible]
ORF01338 | SAG1196 | [illegible] | mutator MutT protein
ORF01339 | SAG1197 | 1072 | hyaluronidase
ORF01340 | SAG1198 | 348 | dTDP-glucose 4,6-dehydratase
ORF01341 | SAG1199 | 197 | dTDP-4-dehydrorhamnose 3,5-epimerase
ORF01342 | SAG1200 | 289 | glucose-1-phosphate thymidylyltransferase
ORF01343 | SAG1201 | 367 | iminodiacetate oxidase, putative
ORF01344 | SAG1202 | 262 | conserved hypothetical protein [illegible]
ORF01345 | SAG1203 | 227 | conserved hypothetical protein
ORF01346 | SAG1204 | [illegible] | DNA replication protein [illegible], putative
ORF01347 | SAG1205 | 172 | [illegible]
ORF01348 | SAG1206 | 854 | conserved domain protein
ORF01349 | SAG1207 | 32 | hypothetical protein
ORF01350 | SAG1208 | 732 | single-stranded-DNA-specific exonuclease RecJ
ORF01351 | SAG1209 | 253 | oxidoreductase, short chain dehydrogenase/reductase family
ORF01352 | SAG1210 | 309 | metallo-beta-lactamase superfamily protein
ORF01353 | SAG1211 | 215 | conserved hypothetical protein
ORF01354 | SAG1212 | 412 | GTP-binding protein HflX
ORF01355 | SAG1213 | 296 | tRNA delta(2)-isopentenylpyrophosphate transferase
ORF01356 | SAG1214 | 58 | hypothetical protein
ORF01357 | SAG1215 | 305 | exfoliative toxin A, putative
ORF01358 | SAG1216 | 1252 | pullulanase, putative
ORF01361 | SAG1217 | | conserved hypothetical protein, FRAMESHIFT
ORF01362 | SAG1218 | 194 | conserved hypothetical protein
ORF01363 | SAG1219 | 468 | peptidase, M20/M25/M40 family
ORF01364 | SAG1220 | 200 | nitroreductase family protein
ORF01365 | SAG1221 | | glycerophosphoryl diester phosphodiesterase, putative, POINT MUTATION
ORF01367 | SAG1222 | 593 | excinuclease ABC, C subunit
ORF01368 | SAG1223 | 255 | conserved hypothetical protein
ORF01369 | SAG1224 | 446 | MATE efflux family protein
ORF01370 | SAG1225 | 136 | conserved hypothetical protein
ORF01371 | SAG1226 | 165 | conserved hypothetical protein
ORF01372 | [illegible] | 198 | conserved hypothetical protein
ORF01373 | SAG1228 | 96 | ISSdy1, transposase OrfA
ORF01374 | SAG1229 | 259 | ISSdy1, transposase OrfB
ORF01375 | SAG1230 | 96 | conserved hypothetical protein
ORF01377 | SAG1231 | | transposase OrfB, IS3 family, degenerate FRAMESHIFT
ORF01379 | SAG1232 | 77 | transposase OrfB, IS3 family, truncation
ORF01380 | SAG1233 | 822 | streptococcal histidine triad family protein
ORF01381 | SAG1234 | 306 | laminin-binding surface protein
ORF01382 | SAG1235 | 425 | GBSi1, group II intron, maturase
ORF01383 | SAG1236 | | C5a peptidase precursor FRAMESHIFT
ORF01384 | SAG1237 | 444 | hypothetical protein
ORF01385 | SAG1238 | 202 | hypothetical protein
ORF01386 | SAG1239 | 76 | conserved hypothetical protein
ORF01387 | SAG1240 | 125 | conserved hypothetical protein, truncation
ORF01388 | SAG1241 | 78 | transposase OrfA, IS3 family
ORF01389 | SAG1242 | 67 | transposase OrfB, IS3 family, truncation
ORF01390 | SAG1243 | 96 | ISSdy1, transposase OrfA FRAMESHIFT
ORF01391 | SAG1244 | 259 | ISSdy1, transposase OrfB
ORF01392 | SAG1245 | 38 | hypothetical protein
ORF01393 | SAG1246 | 389 | hypothetical protein
ORF01394 | SAG1247 | 399 | integrase, phage family
ORF01395 | SAG1248 | 75 | conserved hypothetical protein
ORF01396 | SAG1249 | 74 | transcriptional regulator, Cro/CI family
ORF01397 | SAG1250 | 621 | Tn5252, relaxase
ORF01398 | SAG1251 | 121 | Tn5252, Orf 9 protein
ORF01399 | SAG1252 | 120 | Tn5252, Orf 10 protein
ORF01401 | SAG1253 | 435 | transposase, ISL3 family
ORF01403 | SAG1254 | 546 | mercuric reductase
ORF01404 | SAG1255 | 130 | mercuric resistance operon regulatory protein MerR
ORF01406 | SAG1256 | 142 | IS861, transposase OrfB, truncation
ORF01407 | SAG1257 | 709 | cation-transporting ATPase, E1-E2 family
ORF01408 | SAG1258 | 122 | cadmium efflux system accessory protein
ORF01409 | SAG1259 | 99 | conserved hypothetical protein
ORF01410 | SAG1260 | 262 | hypothetical protein
ORF01411 | SAG1261 | 198 | conserved hypothetical protein
ORF01412 | SAG1262 | 695 | cation-transporting ATPase, E1-E2 family
ORF01414 | SAG1263 | | conserved domain protein, FRAMESHIFT
ORF01415 | SAG1264 | 148 | transcriptional repressor CopY, putative
ORF01416 | SAG1265 | 206 | cadmium resistance transporter, putative
ORF01417 | SAG1266 | 152 | hypothetical protein
ORF01418 | SAG1267 | 108 | hypothetical protein
ORF01419 | SAG1268 | 230 | repressor protein, putative
ORF01420 | SAG1269 | 44 | hypothetical protein
ORF01421 | SAG1270 | 471 | ImpB/MucB/SamB family protein
ORF01423 | SAG1271 | 116 | conserved hypothetical protein
ORF01424 | SAG1272 | 102 | conserved hypothetical protein
ORF01425 | SAG1273 | 118 | conserved hypothetical protein
ORF01426 | SAG1274 | 129 | conserved hypothetical protein
ORF01427 | SAG1275 | 75 | hypothetical protein
ORF01428 | SAG1276 | 358 | conserved hypothetical protein
ORF01430 | SAG1277 | 163 | hypothetical protein
ORF01431 | SAG1278 | 96 | hypothetical protein
ORF01432 | SAG1279 | 99 | conserved domain protein
ORF01433 | SAG1280 | [illegible] | helicase conserved C-terminal domain protein
ORF01434 | SAG1281 | 183 | hypothetical protein
ORF01435 | SAG1282 | 63 | lipoprotein, putative
ORF01436 | SAG1283 | 1631 | cell wall surface anchor family protein
ORF01437 | SAG1284 | 196 | abortive infection protein AbiGI
ORF01438 | SAG1285 | 281 | abortive infection protein AbiGII
ORF01439 | SAG1286 | 933 | conserved hypothetical protein
ORF01440 | SAG1287 | 776 | conserved hypothetical protein
ORF01441 | SAG1288 | 117 | conserved hypothetical protein, DEGENERATE
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Tabl 32: Conversion of ORF R f Nos. with SAG RefNs. 



ORFRefN . 


SAGxxxxR f No. 


aa > 


Annotation 


ORF01442 


SAG1289 


284 


conserved hypothetical protein 


ORF01443 


SAG1290 


80 


hypothetical protein 


ORF01444 


SAG1291 


605 


Tn5252, Orf 21 protein, internal deletion 


ORF01445 


SAG1292 


162 


hypothetical protein 


| ORF01446 


SAG1293 


194 


protease, putative 


ORF01447 


SAG1294 


77 


conserved hypothetical protein 


ORF01449 


SAG1295 


127 


conserved hypothetical protein 


ORF01450 


. SAG1296 


142 


conserved hypothetical protein 


ORF01451 


SAG1297 


451 


type 11 DNA modification methyltransferase Spn5252IP 


ORF01452 


SAG1298 


31 


hypothetical protein 


ORF01453 


SAG1299 


272 


conserved hypothetical protein 


ORF01454 


SAG1300 


57 


conserved hypothetical protein 


ORF01455 


SAG1301 


121 


ribosomal protein L7/L1 2 ! 


ORF01456 


SAG1302 


166 


ribosomal protein L10 


ORF01458 


SAG1303 


702 


ATP-dependent Clp protease, ATP-binding subunit 


ORF01459 


SAG1304 


32 


hypothetical protein 


ORF01460 


SAG1305 


314 


homocysteine S-methyltransferase MmuM, putative 


ORF01461 


SAG1306 


458 


amino acid permease 


ORF01463 


SAG1307 


216 


hypothetical protein 


ORF01464 


SAG1308 


167 


hypothetical protein 


ORF01465 


SAG1309 


30 


hypothetical protein 


ORF01466 


SAG1310 


182 


transcriptional regulator, TetR family 


ORF01467 


SAG1311 


198 


GTP-binding protein 


ORF01468 


SAG1312 


408 


ATP-dependent Clp protease, ATP-binding subunit 
CIpX 


| ORF01469 


SAG1313 


56 


conserved hypothetical protein 


ORF01470 


SAG1314 


164 


dihydrofolate reductase 


ORF01471 


SAG1315 


279 


thymidylate synthase 


ORF01472 


SAG1316 


390 


HMG-CoA synthase


ORF01473 


SAG1317 


427 


3-hydroxy-3-methylglutaryl-CoA reductase 


ORF01474 


SAG1318 


149 


conserved hypothetical protein 


ORF01475 


SAG1319 


187 


hemolysin III, putative 


ORF01476 


SAG1320 


304 


conserved hypothetical protein TIGR00147 


ORF01477 


SAG1321 


284 


glutathione S-transferase family protein 


ORF01478 


SAG1322 


72 


conserved domain protein


ORF01479 


SAG1323 


331 


isopentenyl-diphosphate delta-isomerase 


ORF01480 


SAG1324 


330 


phosphomevalonate kinase 


ORF01481 


SAG1325 


314 


diphosphomevalonate decarboxylase 


ORF01482 


SAG1326 


292


mevalonate kinase, putative 


ORF01483 


SAG1327 


409 


sensor histidine kinase


ORF01484 


SAG1328 


228 


DNA-binding response regulator 


ORF01485 


SAG1329 


208 


GTP pyrophosphokinase family protein 


ORF01486 


SAG1330 


68 


hypothetical protein 


Ur\rU14oo 


SAG1331


Q7Q 


DC nr/%tAln 

r\o proiein 


ORF01489 


SAG1332 


146


transcriptional regulator, MarR family, putative 


ORF01490 


SAG1333 


690 


5'-nucleotidase family protein


ORF01491 


SAG1334 


136 


polypeptide deformylase, putative 


ORF01492 


SAG1335 


449 


NADP-specific glutamate dehydrogenase 


ORF01494 


SAG1336 


169 


conserved hypothetical protein


ORF01495 


SAG1337 


589 


ABC transporter, ATP-binding/permease protein


ORF01496 


SAG1338


579 


ABC transporter, ATP-binding/permease protein 


ORF01497 


SAG1339 


157 


acetyltransferase, GNAT family
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF01498 


SAG1340 


622


ABC transporter, ATP-binding protein


ORF01499


SAG1341


402


polyA polymerase family protein


ORF01500


SAG1342


282


DegV family protein


ORF01501


SAG1343


126


conserved hypothetical protein


ORF01502


SAG1344


177


hypothetical protein


ORF01503 


SAG1345 


164 


conserved hypothetical protein 


ORF01504


SAG1346


641


PTS system, fructose specific IIABC components


ORF01505


SAG1347


303


1-phosphofructokinase


ORF01506 


SAG1348 


247 


lactose phosphotransferase system repressor 


ORF01507


SAG1349


411


beta-lactam resistance factor


ORF01508


SAG1350


544


surface antigen-related protein


ORF01509


SAG1351


307 


2-dehydropantoate 2-reductase, putative 


ORF01510 


SAG1352 


366 


regulatory protein, putative


ORF01511


SAG1353


330


pyridine nucleotide-disulphide oxidoreductase family
protein


ORF01512


SAG1354


251


tRNA (guanine-N1)-methyltransferase


ORF01513


SAG1355


172


16S rRNA processing protein RimM


ORF01515


SAG1356


503


transcriptional regulator, RofA family


ORF01516


SAG1357


80


KH domain protein


ORF01517


SAG1358


90


ribosomal protein S16


ORF01518


SAG1359


415


permease, putative


ORF01519 


SAG1360 


236 


ABC transporter, ATP-binding protein


ORF01520


SAG1361


414


conserved hypothetical protein


ORF01522


SAG1362


532


carbamoyl-phosphate synthase, large subunit, putative


ORF01523


SAG1363


356


carbamoyl-phosphate synthase, small subunit


ORF01524


SAG1364


173


pyrimidine operon regulatory protein


ORF01525


SAG1365


296


ribosomal large subunit pseudouridine synthase, RluD
subfamily


ORF01526


SAG1366 


154 


lipoprotein signal peptidase 


ORF01527 


SAG1367 


301 


transcriptional regulator, LysR family


ORF01528


SAG1368


94


ribosomal protein L27


ORF01529


SAG1369


112


conserved hypothetical protein


ORF01530


SAG1370


104


ribosomal protein L21


ORF01531


SAG1371


392


conserved hypothetical protein


ORF01532


SAG1372


404


thiamine biosynthesis protein ThiI


ORF01533


SAG1373


381


cysteine desulphurase


ORF01535 


SAG1374 


150 


conserved hypothetical protein


ORF01536


SAG1375


449


glutathione reductase


ORF01537


SAG1376


111


conserved hypothetical protein


ORF01538


SAG1377


388


chorismate synthase


ORF01539


SAG1378


355


3-dehydroquinate synthase


ORF01540


SAG1379


225


3-dehydroquinate dehydratase


ORF01541


SAG1380


385


conserved hypothetical protein


ORF01542 


SAG1381 


714 


sulfatase


ORF01543


SAG1382


119 


ribosomal protein L20


ORF01544


SAG1383


66


ribosomal protein L35


ORF01545


SAG1384


176


translation initiation factor IF-3


ORF01546


SAG1385


227


cytidylate kinase


ORF01547 


SAG1386 


174 


conserved hypothetical protein 


ORF01548 


SAG1387 


65 


ferredoxin, 4Fe-4S


ORF01549 


SAG1388 


163 


conserved hypothetical protein 


ORF01550 


SAG1389 


406 


peptidase T


ORF01551 


SAG1390 


544 


polysaccharide biosynthesis protein, putative
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF01552


SAG1391


484


UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-
diaminopimelate ligase


ORF01553


SAG1392




iron compound ABC transporter, ATP-binding protein


ORF01554


SAG1393


310


iron compound ABC transporter, substrate-binding
protein


ORF01555


SAG1394


341


iron compound ABC transporter, permease protein


ORF01556


SAG1395


333


iron compound ABC transporter, permease protein


ORF01557


SAG1396


217


conserved hypothetical protein


ORF01558


SAG1397


311


inorganic pyrophosphatase, manganese-dependent


ORF01559


SAG1398


262


pyruvate formate-lyase-activating enzyme


ORF01560


SAG1399


444


CBS domain protein


ORF01561


SAG1400


188


conserved hypothetical protein


ORF01563


SAG1401


311


conserved hypothetical protein TIGR01212


ORF01564


SAG1402


213


PAP2 family protein


ORF01565


SAG1403


194


membrane protein, putative


ORF01566


SAG1404


308


cell wall surface anchor family protein


ORF01567


SAG1405


294


sortase family protein


ORF01568


SAG1406


293


sortase family protein


ORF01569


SAG1407


705


cell wall surface anchor family protein


ORF01570


SAG1408


901


cell wall surface anchor family protein


ORF01571


SAG1409


326


transcriptional regulator, RofA family FRAMESHIFT


ORF01572


SAG1410


379


glycosyl transferase, group 1 family protein


ORF01573


SAG1411


282


exopolysaccharide biosynthesis protein, putative


ORF01574


SAG1412


474


exopolysaccharide biosynthesis protein, putative


ORF01575


SAG1413


454


hypothetical protein


ORF01576


SAG1414


308


glycosyl transferase, group 2 family protein


ORF01577


SAG1415


311


glycosyl transferase, group 2 family protein


ORF01578


SAG1416


352


dTDP-glucose 4,6-dehydratase, putative


ORF01579


SAG1417


240


4-diphosphocytidyl-2C-methyl-D-erythritol synthase,
putative


ORF01580


SAG1418


259


licD protein, putative


ORF01581


SAG1419


577


hypothetical protein


ORF01582


SAG1420


117


conserved hypothetical protein


ORF01583


SAG1421


243


glycosyl transferase, group 2 family protein


ORF01584


SAG1422


313


glycosyl transferase, group 2 family protein


ORF01585


SAG1423


384


conserved hypothetical protein


ORF01586


SAG1424


284


dTDP-4-dehydrorhamnose reductase


ORF01587


SAG1425


113


conserved hypothetical protein


ORF01589


SAG1426


369


RNA polymerase sigma-70 factor


ORF01590


SAG1427


602


DNA primase


ORF01591


SAG1428


125


large conductance mechanosensitive channel protein


ORF01592


SAG1429


58


ribosomal protein S21


ORF01593


SAG1430


167


conserved hypothetical protein


ORF01594


SAG1431


268


amino acid ABC transporter, amino acid-binding
protein


ORF01596


SAG1432


347


ammonium transporter family protein


ORF01597


SAG1433


375


conserved hypothetical protein


ORF01598


SAG1434


328


rhodanese family protein


ORF01599


SAG1435


101


conserved hypothetical protein


ORF01600


SAG1436


457


glycerol-3-phosphate transporter, putative
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation




ORF01601


SAG1437


55


hypothetical protein


ORF01602


SAG1438


754


glycogen phosphorylase


ORF01603


SAG1439


498


4-alpha-glucanotransferase


ORF01604


SAG1440


342


maltose operon repressor MalR, putative


ORF01605


SAG1441


415


maltose/maltodextrin ABC transporter,
maltose/maltodextrin-binding protein


ORF01606


SAG1442


456


maltose ABC transporter, permease protein


ORF01607


SAG1443


278


maltose ABC transporter, permease protein


ORF01608


SAG1444


490


proton/peptide symporter family protein


ORF01610


SAG1445


MutT/nudix family protein, FRAMESHIFT


ORF01611


SAG1446


62


hypothetical protein


ORF01612


SAG1447


441


conserved hypothetical protein


ORF01613


SAG1448


cno 1 


glycosyl transferase, group 1 family protein


\JT\1r\J 1 0 1 *f | 


SAG1449 


795 


preprotein translocase SecA subunit, putative


v^rxi u id io | 


SAG1450 


330


conserved domain protein


UrVru IO 1 1 | 


SAG1451 


494 


conserved hypothetical protein


I flRPftifiiA 1 


SAG1452 


514


conserved hypothetical protein


ORF01619


SAG1453


409




ORF01621


SAG1454


398


conserved hypothetical protein


ORF01622


SAG1455


295


glycosyl transferase, group 2 family protein


ORF01623


SAG1456 


312 


glycosyl iransierase, mnmy ©» ucywiwaw § 


ORF01624


SAG1457


129


IS1381, transposase OrfB


ORF01625


SAG1458


127


IS1381, transposase OrfA


ORF01626


SAG1459


413


glycosyl transferase, family 8


ORF01627


SAG1460


401


glycosyl transferase, family 8


ORF01628


SAG1461


335


conserved hypothetical protein


ORF01630 


SAG1462


970


cell wall surface anchor family protein


ORF01632


SAG1463




transcriptional regulator, RofA family POINT MUTATION


ORF01634


SAG1464


663




ORF01635


SAG1465


306


protease, putative


ORF01636


SAG1466


727


glutamine ABC transporter, glutamine-binding
protein/permease protein, putative


ORF01637


SAG1467


246


glutamine ABC transporter, ATP-binding protein GlnQ,
putative


ORF01638


SAG1468


116


conserved hypothetical protein


ORF01639


SAG1469


52


conserved hypothetical protein


ORF01640


SAG1470


437


GTP-binding protein, GTP1/Obg family


ORF01641


SAG1471


42


conserved hypothetical protein


ORF01643


SAG1472


413


aminopeptidase PepS


ORF01645


SAG1473


192


cell wall surface anchor family protein


ORF01646


SAG1474


680


amidase family protein


ORF01647


SAG1475


240


ribosomal small subunit pseudouridine synthase A



ORF01648


SAG1476


280


oxidoreductase, aldo/keto reductase family


ORF01650


SAG1477


224


nitroreductase family protein


ORF01651


SAG1478


130


lactoylglutathione lyase


ORF01652


SAG1479


308


glycosyl transferase, group 2 family protein



ORF01653


SAG1480


462


amino acid permease


ORF01654


SAG1481


155


SsrA-binding protein


ORF01655


SAG1482


801


exoribonuclease, VacB/Rnb family


ORF01657


SAG1483


78


preprotein translocase, SecG subunit


ORF01658


SAG1485


389


multi-drug resistance protein


ORF01660


SAG1486


548


hypothetical protein


ORF01661


SAG1487


233


ABC transporter, ATP-binding protein
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF01662


SAG1488


195


dephospho-CoA kinase


ORF01663


SAG1489


273


formamidopyrimidine-DNA glycosylase


ORF01665


SAG1490


282


transcriptional regulator, MutR family


ORF01666


SAG1491


530


hypothetical protein


ORF01667 


SAG1492 


58 


hypothetical protein 


ORF01668


SAG1493




hypothetical protein


ORF01670 


SAG1494 


32 


hypothetical protein


ORF01672


SAG1495


81


protease, putative, POINT MUTATION


ORF01673


SAG1496


110


hypothetical protein 


ORF01674 


SAG1497 


37 


hypothetical protein


ORF01675


SAG1498


133


hypothetical protein


ORF01677


SAG1499 


299 


GTP-binding protein Era 


ORF01678


SAG1500


132


diacylglycerol kinase


ORF01679


SAG1501


161


conserved hypothetical protein TIGR00043


ORF01680


SAG1502


268


tetracenomycin polyketide synthesis O-
methyltransferase TcmP, putative


ORF01681 


SAG1503 


39 


hypothetical protein


ORF01682


SAG1504


38


hypothetical protein


ORF01683


SAG1505


158


MutT/nudix family protein


ORF01684


SAG1506


267


hypothetical protein


ORF01685


SAG1507


345


PhoH family protein


ORF01686


SAG1508


590


67 kDa Myosin-crossreactive streptococcal antigen


ORF01687


SAG1509


71


conserved hypothetical protein


ORF01688


SAG1510


169


peptide methionine sulfoxide reductase


ORF01689 


SAG1511 


284 


conserved hypothetical protein


ORF01690 


SAG1512 


185 


ribosome recycling factor


ORF01691 


SAG1513 


242 


uridylate kinase


ORF01692 


SAG1514 


226


peptide ABC transporter, ATP-binding protein


ORF01693


SAG1515


262


peptide ABC transporter, ATP-binding protein


ORF01694


SAG1516


255


peptide ABC transporter, permease protein


ORF01695


SAG1517


314


peptide ABC transporter, permease protein


ORF01696


SAG1518


526


peptide ABC transporter, peptide-binding protein


ORF01697


SAG1519


229


ribosomal protein L1


ORF01698


SAG1520


141


ribosomal protein L11


ORF01699


SAG1521


388


transposase, IS30 family, putative


ORF01700


SAG1522


460


transporter, major facilitator family


ORF01702 


SAG1523 


404 


peptidase, M20/M25/M40 family 


ORF01703


SAG1524


294


transcriptional regulator, LysR family


ORF01704 


SAG1525 


117


conserved hypothetical protein


ORF01705


SAG1526


178


IS861, transposase OrfA


ORF01706


SAG1527


277


IS861, transposase OrfB


ORF01707


SAG1528


571


chorismate binding enzyme


ORF01708


SAG1529


785


FtsK/SpoIIIE family protein


ORF01709


SAG1530


267


peptidyl-prolyl cis-trans isomerase, cyclophilin-type


ORF01710


SAG1531


277


manganese ABC transporter, permease protein


ORF01711


SAG1532


238


manganese ABC transporter, ATP-binding protein


ORF01712


SAG1533


308


manganese ABC transporter, manganese-binding
adhesion lipoprotein


ORF01713


SAG1534


215


iron-dependent transcriptional regulator


ORF01714 


SAG1535 


229 


5-methylthioadenosine nucleosidase/S-
adenosylhomocysteine nucleosidase


ORF01715


SAG1536




conserved hypothetical protein


ORF01716


SAG1537


184


MutT/nudix family protein
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF01718 


SAG1538 


459 


UDP-N-acetylglucosamine pyrophosphorylase


ORF01719 


SAG1539 


31 


hypothetical protein 


ORF01720 


SAG1540 


137 


conserved hypothetical protein 


ORF01721 


SAG1541 


125 


glyoxalase family protein 


ORF01722 


SAG1542


318 


oxidoreductase, Gfo/Idh/MocA family


ORF01724 


SAG1543 




conserved hypothetical protein, FRAMESHIFT


ORF01725 


SAG1544 


232


gluconate 5-dehydrogenase, putative 


ORF01726 


SAG1545 


78 


conserved hypothetical protein 


ORF01727 


SAG1546 


82 


conserved hypothetical protein 


ORF01729 


SAG1547


166


acetyltransferase, GNAT family 


ORF01730 


SAG1548 


422 


glycosyl transferase, group 2 family protein 


ORF01731 


SAG1549 


127 


IS1381, transposase OrfA 


ORF01732 


SAG1550 


129 


IS1381, transposase OrfB 


ORF01733 


SAG1551 


67 


hypothetical protein 


ORF01734 


SAG1552 


719 


conserved hypothetical protein 


ORF01735 


SAG1553 


477 


hypothetical protein 


ORF01736 


SAG1554 


225 


hypothetical protein 


ORF01737 


SAG1555 


231 


hypothetical protein 


ORF01738 


SAG1556 


445 


branched-chain amino acid transport system II carrier 
protein 


ORF01739 


SAG1557 


665 


methionyl-tRNA synthetase


ORF01740 


SAG1558 


291 


tellurite resistance protein TehB 


ORF01741 


SAG1559 


231 


membrane protein, putative 


ORF01742 


SAG1560 


40 


hypothetical protein 


ORF01743 


SAG1561 


405 


PTS system component, putative 


ORF01744 


SAG1562 


280 


conserved hypothetical protein 


ORF01745 


SAG1563 


275 


exodeoxyribonuclease 


ORF01746 


SAG1564 


118 


conserved hypothetical protein 


ORF01747 


SAG1565 


158 


methylated-DNA-protein-cysteine S-methyltransferase 


ORF01748 


SAG1566 


393 


D-isomer specific 2-hydroxyacid dehydrogenase family 
protein 


ORF01749 


SAG1567 


182 


acetyltransferase, GNAT family


ORF01750 


SAG1568 




phosphoserine aminotransferase FRAMESHIFT 


ORF01752 


SAG1569 


211 


copper homeostasis protein CutC, putative 


ORF01753 


SAG1570 


34 


conserved hypothetical protein 


ORF01754


SAG1571 


53 


hypothetical protein


ORF01755 


SAG1572 


287 


tetrapyrrole methylase family protein 


ORF01756 


SAG1573 


108 


conserved hypothetical protein 


ORF01758 


SAG1574 


287 


DNA polymerase III, delta prime subunit, putative 


ORF01759 


SAG1575 


211 


thymidylate kinase 


ORF01761 


SAG1576 


267 


transposase, IS30 family, putative, truncation 


ORF01763


SAG1577 


219 


AcuB family protein 


ORF01764 


SAG1578 


236 


branched-chain amino acid ABC transporter, ATP- 
binding protein 


ORF01765 


SAG1579 


254 


branched-chain amino acid ABC transporter, ATP-
binding protein


ORF01766 


SAG1580 


317 


branched-chain amino acid ABC transporter, 
permease protein 


ORF01767 


SAG1581 


289 


branched-chain amino acid ABC transporter, 
permease protein 


ORF01769 


SAG1582 


388 


branched-chain amino acid ABC transporter, amino
acid-binding protein 


ORF01770 


SAG1583 


81 


conserved hypothetical protein 


ORF01772 


SAG1584 


377 


IS1548, transposase
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF01773


SAG1585


196


ATP-dependent Clp protease, proteolytic subunit ClpP


ORF01774


SAG1586


209


uracil phosphoribosyltransferase


ORF01775


SAG1587


389


aminotransferase, class I


ORF01777


SAG1588


182


RNA methyltransferase, TrmH family, group 2


ORF01778


SAG1589


450


amino acid permease, putative


ORF01779


SAG1590


449


potassium uptake protein, Trk family


ORF01780


SAG1591


475


cation uptake protein, Trk family


ORF01781


SAG1592 


83 


unserved hypothetical protein TlGRUUZftt | 


ORF01782


SAG1593


240


ribosomal large subunit pseudouridine synthase B


ORF01783


SAG1594


194 


conserved nypometicai proiein i lotsuuzo i 


ORF01784 


SAG1595


235 


Uncharactenzed agr, LUbidi34 i 


ORF01785


SAG1596


246


integrase/recombinase, phage integrase family


ORF01786


SAG1597


157


CBS domain protein


ORF01787


SAG1598


173


conserved hypothetical protein


ORF01788


SAG1599


324


HAM1 protein


ORF01789


SAG1600


264


glutamate racemase


ORF01790


SAG1601


79


conserved hypothetical protein


ORF01791


SAG1602


180


membrane protein, putative


ORF01792


SAG1603 


173 


transcriptional regulator, biotin repressor family 


ORF01793 


SAG1604 


229 


membrane protein, putative


ORF01794


SAG1605


167


conserved hypothetical protein


ORF01795


SAG1606


247


RNA methyltransferase, TrmH family


ORF01796


SAG1607


92


acylphosphatase


ORF01797


SAG1608


310


membrane protein, putative


ORF01799


SAG1609


221


amino acid ABC transporter, permease protein


ORF01800


SAG1610


285


amino acid ABC transporter, substrate-binding protein


ORF01801


SAG1611


486


amidase family protein


ORF01802


SAG1612


160


transcription elongation factor GreA


ORF01803


SAG1613


600


Uncharacterized BCR, YceG family COG1559, putative


ORF01804


SAG1614


167


acetyltransferase, GNAT family


ORF01805


SAG1615


443


UDP-N-acetylmuramate--alanine ligase


ORF01806


SAG1616


205


conserved hypothetical protein


ORF01807


SAG1617


32


hypothetical protein


ORF01808


SAG1618


1032


f Jl.auT.ili. J Aln 1 

Snf2 family protein


ORF01810


SAG1619


377


IS1548, transposase


ORF01811


SAG1620


436


pnOSpnOQiyCwfcli" utsiiyvii uyoi iao «n vioivm j) 


ORF01812


SAG1621


300


primosomal protein DnaI


ORF01813


SAG1622


i QH4 


conserved hypothetical protein


ORF01814 


SAG1623


1 a cn 


conserved hypothetical protein TIGR00244


ORF01815


SAG1624


1 CA4 


sensor rusuuine miwoo uaio ■ 


ORF01816 


SAG1625 


il OOQ 
1 




ORF01817


SAG1626


j If/ 




ORF01818


SAG1627


296


heat shock protein HtpX


ORF01820


SAG1628


184


lemA protein


ORF01821


SAG1629


237


glucose-inhibited division protein B


ORF01822


SAG1630


459


sodium transport family protein


ORF01823


SAG1631


223


potassium uptake protein, Trk family, putative


ORF01824


SAG1632


276


cobalt transport family protein


ORF01825


SAG1633


558


ABC transporter, ATP-binding protein


ORF01826


SAG1634


212


conserved hypothetical protein


ORF01827


SAG1635


402


sodium:dicarboxylate symporter family protein
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF01828


SAG1636


455


branched-chain amino acid transport system II carrier
protein


ORF01829


SAG1637


351


alcohol dehydrogenase, zinc-containing


ORF01830


SAG1638


230


ABC transporter, permease protein


ORF01831


SAG1639


356


ABC transporter, ATP-binding protein


ORF01832


SAG1640


458


peptidase, M20/M25/M40 family


ORF01833


SAG1641


274


lipoprotein, putative


ORF01834


SAG1642


277


ABC transporter, substrate-binding protein


ORF01835


SAG1643


229


glutamine amidotransferase, class I


ORF01836 


SAG1644 


37


hypothetical protein


ORF01837


SAG1645


238


conserved hypothetical protein TIGR01033


ORF01838 


SAG1646 


32 


hypothetical protein


ORF01839


SAG1647


328


dihydroxyacetone kinase family protein


ORF01840


SAG1648


178


transcriptional regulator, TetR family, putative


ORF01842


SAG1649


37


hypothetical protein


ORF01843


SAG1650


329


dihydroxyacetone kinase family protein


ORF01844


SAG1651


192


dihydroxyacetone kinase family protein


ORF01845


SAG1652


124


conserved hypothetical protein


ORF01846


SAG1653


237


glycerol uptake facilitator protein


ORF01847


SAG1654


134


conserved hypothetical protein


ORF01848


SAG1655


237


transcriptional regulator, MerR family


ORF01849


SAG1656


369


conserved hypothetical protein


ORF01850


SAG1657


83


hypothetical protein


ORF01851


SAG1658


244


conserved hypothetical protein


ORF01852


SAG1659


118


iojap-related protein


ORF01853


SAG1660


173


isochorismatase family protein


ORF01854


SAG1661


195


conserved hypothetical protein TIGR00488


ORF01855


SAG1662


210


conserved hypothetical protein TIGR00482


ORF01856


SAG1663


105


conserved hypothetical protein TIGR00253


ORF01857


SAG1664


372


GTP-binding protein


ORF01858


SAG1665


177


hydrolase, haloacid dehalogenase-like family


ORF01859


SAG1666


295


membrane protein


ORF01860


SAG1667


480


glutamyl-tRNA(Gln) amidotransferase, B subunit


ORF01861


SAG1668


488


glutamyl-tRNA(Gln) amidotransferase, A subunit


ORF01862


SAG1669


100


glutamyl-tRNA(Gln) amidotransferase, C subunit


ORF01863


SAG1670


881


pyruvate phosphate dikinase


ORF01864 


SAG1671


276


conserved hypothetical protein


ORF01865


SAG1672


170


CBS domain protein


ORF01866


SAG1673


377


3-hydroxyacyl-CoA dehydrogenase family protein


ORF01867


SAG1674


182


isochorismatase family protein


ORF01869


SAG1675


261


transcriptional regulator CodY, putative


ORF01870


SAG1676


403


aminotransferase, class I


ORF01871


SAG1677


137


universal stress protein family FRAMESHIFT


ORF01872


SAG1678


460


hydrolase, haloacid dehalogenase-like family


ORF01873


SAG1679


320


asparaginase family protein


ORF01874


SAG1680


292


shikimate 5-dehydrogenase


ORF01875


SAG1681


304


oxidoreductase, aldo/keto reductase family


ORF01876


SAG1682


671


ATP-dependent DNA helicase RecG


ORF01877


SAG1683


512


immunogenic secreted protein, putative


ORF01878


SAG1684


366


alanine racemase


ORF01879


SAG1685


119


holo-(acyl-carrier-protein) synthase


ORF01880


SAG1686


335


phospho-2-dehydro-3-deoxyheptonate aldolase


ORF01881


SAG1687


842


preprotein translocase, SecA subunit


ORF01882


SAG1688


315


mannose-6-phosphate isomerase, class I


ORF01883


SAG1689


293


fructokinase
Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF01885 


SAG1690 


639


PTS system IIABC components


ORF01886


SAG1691


479


sucrose-6-phosphate hydrolase


ORF01887


SAG1692


320


sucrose operon repressor ScrR


ORF01888


SAG1693


144


N utilization substance protein B


ORF01889 


SAG1694 


129 


conserved hypothetical protein 


ORF01890 


SAG1695 


186 


translation elongation factor P 


ORF01892 


SAG1696 


38 


hypothetical protein 


ORF01893 


SAG1697 


48 


hypothetical protein 


ORF01894


SAG1698


99


conserved hypothetical protein 


ORF01895 


SAG1699


30 


hypothetical protein 


ORF01896


SAG1700


76


hypothetical protein 


ORF01897 


SAG1701 


56


hypothetical protein


ORF01898 


SAG1702 


41 


hypothetical protein 


ORF01899 


SAG1703 


54


hypothetical protein 


ORF01900 


SAG1704 


150


cytidine/deoxycytidylate deaminase family protein


ORF01902


SAG1705


peptidase, M24 family POINT MUTATION


ORF01903


SAG1706


238


conserved hypothetical protein


ORF01904


SAG1707


499


drug resistance transporter, EmrB/QacA family


ORF01905


SAG1708


38


hypothetical protein


ORF01906


SAG1709


942


excinuclease ABC, A subunit


ORF01907


SAG1710


223


conserved hypothetical protein


ORF01908


SAG1711


314 


magnesium transporter, CorA family 


ORF01909 


SAG1712 


79 


ribosomal protein S18 


ORF01910 


SAG1713 


163 


single-strand binding protein 


ORF01911 


SAG1714


95


ribosomal protein S6


ORF01912


SAG1715


374


A/G-specific adenine glycosylase


ORF01913


SAG1716


197


transcriptional regulator, Cro/CI family


ORF01914 


SAG1717 


104 


thioredoxin 


ORF01915 


SAG1718 


166 


PAP2 family protein 


ORF01916 


SAG1719 


779


MutS2 family protein


ORF01917


SAG1720


180


conserved hypothetical protein


ORF01918


SAG1721


103


conserved hypothetical protein


ORF01919


SAG1722


297


ribonuclease HIII


ORF01920


SAG1723


197


signal peptidase I


ORF01921 


SAG1724 


806 


helicase, putative 


ORF01922 


SAG1725 


160 


conserved hypothetical protein


ORF01923


SAG1726


364


DNA-damage inducible protein P


ORF01924


SAG1727


770


formate acetyltransferase


ORF01925


SAG1728


124


FMN-binding protein


ORF01926


SAG1729


309


conserved hypothetical protein


ORF01927


SAG1730


251


proteinase, putative, degenerate, FRAMESHIFT


ORF01928


SAG1731


298


membrane protein, putative


ORF01929


SAG1732


282


glycerol uptake facilitator protein, putative


ORF01930


SAG1733


150


universal stress protein family


ORF01931


SAG1734


400


transporter, putative


ORF01932


SAG1735


219


transcriptional regulator, Crp/Fnr family


ORF01933


SAG1736


761


X-pro dipeptidyl-peptidase 


ORF01934 


SAG1737 


119 


hypothetical protein


ORF01936


SAG1738


326


polyprenyl synthetase family protein


ORF01937


SAG1739


582


ABC transporter, ATP-binding protein CydC


ORF01938


SAG1740


572


ABC transporter, ATP-binding protein CydD


ORF01939


SAG1741


339


cytochrome d ubiquinol oxidase, subunit II


ORF01940


SAG1742


475


cytochrome d oxidase, subunit I


ORF01941


SAG1743


402


pyridine nucleotide-disulphide oxidoreductase family
protein






Table 32: Conversion of ORF Ref Nos. with SAG Ref Nos.



ORF Ref No.


SAGxxxx Ref No.


aa


Annotation


ORF01942


SAG1744


299


prenyltransferase, UbiA family


ORF01943


SAG1745


148


hypothetical protein 


ORF01944


SAG1746 


35 


hypothetical protein 


ORF01945 


SAG1747 


99 


conserved hypothetical protein TIGR00103


ORF01946


SAG1748


396


cyclopropane-fatty-acyl-phospholipid synthase


ORF01947


SAG1749


241


transcriptional regulator, MerR family


ORF01948


SAG1750


195


exonuclease


ORF01949


SAG1751


178


conserved hypothetical protein


ORF01950


SAG1752


375


conserved hypothetical protein TIGR00275


ORF01951


SAG1753


260


conserved hypothetical protein


ORF01952


SAG1754


89


ribosomal protein S14 


ORF01953 f 


SAG1755 


38 


hypothetical protein 


ORF01954 | 


SAG1756 


341 


conserved hypothetical protein 


| ORF01957 | 


SAG1757 * 


336 I 


O-siatoglycoprotein endopeptidase family protein 


| ORF01958 


SAG1758 I 


135 I 


ribosomal-protein-alanine acetyltransferase, putative 


ORF01960 


SAG1759 I 


230 j 


glycoprotease family protein, putative 


ORF01961 | 


SAG1760 


76 f 


conserved hypothetical protein 


I ORF01962 - 


SAG1761 | 


559 


metailo-beta-lactamase superfamily protein 


I ORF01963 i 


SAG1762 ! 


169 I 


conserved hypothetical protein 


ORF01964 


SAG1763 j 


448 


glutamine synthetase, type 1 


ORF01965 


SAG1764 l 


123 ; 


transcriptional regulator GlnR 


ORF01967 


SAG1765 


179 I 


conserved hypothetical protein 


ORF01969 


SAG1766 


398 I 


phosphogtycerate kinase 


i ORF01970 


SAG1767 


289 


acid phosphatase J 


ORF01971 


t SAG1768 j 


336 


glyceraldehyde 3-phosphate dehydrogenase 


ORF01972 


| SAG1769 


692 


translation elongation factor G 


< ORF01973 


| SAG1770 


156 


ribosomal protein S7 


ORF01974 


| SAG1771 


, 137 


| ribosomal protein S12 


ORF01975 


SAG1772 


! 270 


I pur operon repressor 


ORF01976 


I SAG1773 


I 313 


| HD domain protein 


ORF01977 


i SAG1774 


424 


conserved hypothetical protein 


ORF01978 


| SAG1775 


i 210 


| conserved hypothetical protein 


ORF01979 


| SAG1776 


220 


| ribulose-phosphate 3-epimerase 


ORF01980 


I SAG1777 


! 290 


I conserved hypothetical protein TIGR00157 


ORF01981 


\ SAG1778 


| 283 


| rRNA (guanine-N1-)-methyltransferase, putative 


j ORF01983 


I SAG1779 


290 


1 dimethyladenosine transferase 


ORF01984 


| SAG1780 


I 163 


| hypothetical protein 


| ORF01985 


SAG1781 


! 186 


1 primase-related protein 


\ ORF01987 


SAG1782 


260 


1 deoxyribonuclease, TatD family 


I ORF01988 


j SAG1783 


90 


hypothetical protein 


ORF01989 


j SAG1784 


130 


1 hypothetical protein j 


\ ORF01990 


SAG1785 


430 


1 hypothetical protein 


ORF01991 


SAG1786 


S 130 


1 hypothetical protein 


ORF01992 


I SAG1787 


I 420 


1 ditD protein 


ORF01993 


SAG1788 


I 79 


1 D-atanyl carrier protein 


1 ORF01994 


J §AG17o9 


I 421 


1 dltB protein 


ORF01996 


1 SAG1790 


| 511 


1 D-alanine-activating enzyme 


\ ORF01997 


I SAG1791 


i 395 


1 sensor histidine kinase 


\ ORF01998 


SAG1792 


224 


1 DNA-binding response regulator 


ORF01999 


! SAG1793 


I 44 


1 ribosomal protein L34 


[ ORF02000 


i SAG1794 


I 451 


1 membrane protein, putative 


ORF02001 


SAG1795 


| 388 


1 transposase, IS30 family, putative 


ORF02002 


SAG1798 


I 575 


1 amino acid ABC transporter, permease protein 


I ORF02004 


SAG1797 


I 407 


1 amino add ABC transporter. ATP-binding protein 



ORF02005 | SAG1798 | 39 | hypothetical protein
ORF02006 | SAG1799 | 792 | xylulose-5-phosphate/fructose-6-phosphate phosphoketolase
ORF02007 | SAG1800 | 363 | conserved hypothetical protein
ORF02008 | SAG1801 | 559 | transcriptional antiterminator, BglG family
ORF02009 | SAG1802 | 253 | conserved hypothetical protein
ORF02010 | SAG1803 | 505 | carbohydrate kinase, FGGY family
ORF02011 | SAG1804 | 329 | hypothetical protein
ORF02012 | SAG1805 | 483 | PTS system component, putative
ORF02015 | SAG1806 | 318 | glyoxylate reductase, NADH-dependent
ORF02016 | SAG1807 | 339 | hypothetical protein
ORF02017 | SAG1808 | 327 | sugar binding transcriptional regulator, LacI family
ORF02018 | SAG1809 | 215 | transaldolase family protein
ORF02019 | SAG1810 | 238 | carbohydrate isomerase, AraD/FucA family
ORF02020 | SAG1811 | 287 | hexulose-6-phosphate isomerase, putative
ORF02021 | SAG1812 | 221 | hexulose-6-phosphate synthase, putative
ORF02022 | SAG1813 | 161 | PTS system, IIA component
ORF02023 | SAG1814 | 92 | PTS system, IIB component
ORF02024 | SAG1815 | 479 | transport protein SgaT, putative
ORF02025 | SAG1816 | 205 | hypothetical protein
ORF02026 | SAG1817 | 157 | hypothetical protein
ORF02027 | SAG1818 | 430 | adenylosuccinate synthetase
ORF02028 | SAG1819 | 340 | perfringolysin O regulator protein
ORF02029 | SAG1820 | 224 | conserved hypothetical protein
ORF02030 | SAG1821 | 750 | glutamate-cysteine ligase-related protein
ORF02031 | SAG1822 | 272 | conserved hypothetical protein
ORF02032 | SAG1823 | 416 | conserved hypothetical protein
ORF02033 | SAG1824 | 291 | chaperonin, 33 kDa
ORF02034 | SAG1825 | 325 | NifR3/Smm1 family protein
ORF02035 | SAG1826 | 213 | deoxynucleoside kinase family protein
ORF02036 | SAG1827 | 163 | phosphinothricin N-acetyltransferase
ORF02037 | SAG1828 | 815 | ATP-dependent Clp protease, ATP-binding subunit
ORF02038 | SAG1829 | 154 | transcriptional regulator CtsR
ORF02039 | SAG1830 | 153 | conserved hypothetical protein
ORF02040 | SAG1831 | 346 | translation elongation factor Ts
ORF02041 | SAG1832 | 256 | ribosomal protein S2
ORF02042 | SAG1833 | 186 | alkyl hydroperoxide reductase, subunit C
ORF02043 | SAG1834 | 510 | alkyl hydroperoxide reductase, subunit F
ORF02044 | SAG1835 | 134 | conserved hypothetical protein
ORF02045 | SAG1836 | 61 | conserved hypothetical protein
ORF02046 | SAG1837 | 468 | lysin, putative
ORF02047 | SAG1838 | 109 | holin, putative
ORF02048 | SAG1839 | 136 | conserved hypothetical protein
ORF02049 | SAG1840 | 112 | hypothetical protein
ORF02050 | SAG1841 | 76 | conserved domain protein
ORF02051 | SAG1842 | 1224 | PblB, putative
ORF02053 | SAG1843 | 240 | conserved hypothetical protein
ORF02056 | SAG1844 | 911 | conserved hypothetical protein
ORF02057 | SAG1845 | 42 | hypothetical protein
ORF02058 | SAG1846 | 158 | hypothetical protein
ORF02059 | SAG1847 | 227 | conserved hypothetical protein
ORF02060 | SAG1848 | 114 | conserved hypothetical protein
ORF02061 | SAG1849 | 115 | hypothetical protein


ORF02062 | SAG1850 | 101 | hypothetical protein
ORF02063 | SAG1851 | 111 | conserved domain protein
ORF02064 | SAG1852 | 420 | conserved domain protein
ORF02066 | SAG1853 | 180 | protease, putative
ORF02067 | SAG1854 | 380 | conserved hypothetical protein
ORF02068 | SAG1855 | 570 | terminase large subunit, putative
ORF02069 | SAG1856 | 161 | hypothetical protein
ORF02070 | SAG1858 | 95 | hypothetical protein
ORF02071 | SAG1859 | 180 | site-specific recombinase, phage integrase family
ORF02072 | SAG1860 | 154 | conserved hypothetical protein
ORF02073 | SAG1861 | 119 | transcriptional regulator, Cro/CI family
ORF02075 | SAG1862 | 86 | hypothetical protein
ORF02076 | SAG1863 | 138 | single-strand binding protein
ORF02077 | SAG1864 | 68 | hypothetical protein
ORF02078 | SAG1865 | 74 | conserved hypothetical protein
ORF02079 | SAG1866 | 109 | conserved hypothetical protein
ORF02080 | SAG1867 | 163 | conserved hypothetical protein
ORF02081 | SAG1868 | 134 | hypothetical protein
ORF02082 | SAG1869 | 437 | type II DNA modification methyltransferase, putative
ORF02083 | SAG1870 | 273 | DNA replication protein DnaC, putative
ORF02084 | SAG1871 | 248 | conserved hypothetical protein
ORF02085 | SAG1872 | 200 | hypothetical protein
ORF02086 | SAG1873 | 443 | replicative DNA helicase
ORF02087 | SAG1874 | 87 | hypothetical protein
ORF02088 | SAG1875 | 94 | conserved hypothetical protein
ORF02089 | SAG1876 | 176 | HNH endonuclease family protein
ORF02090 | SAG1877 | 236 | antirepressor protein, putative
ORF02091 | SAG1878 | 102 | conserved domain protein
ORF02092 | SAG1879 | 156 | hypothetical protein
ORF02093 | SAG1880 | 54 | hypothetical protein
ORF02094 | SAG1881 | 51 | hypothetical protein
ORF02095 | SAG1882 | 120 | repressor protein, putative
ORF02097 | SAG1884 | 134 | hypothetical protein
ORF02098 | SAG1885 | 356 | site-specific recombinase, phage integrase family
ORF02100 | SAG1886 | 32 |
ORF02101 | SAG1887 | 689 | Na+/H+ exchanger family protein
ORF02102 | SAG1888 | 78 |
ORF02103 | SAG1889 | 317 | microcin immunity protein, putative
ORF02104 | SAG1890 | | endopeptidase [illegible]
ORF02105 | SAG1891 | 327 | oxidoreductase, Gfo/Idh/MocA family
ORF02107 | SAG1892 | [illegible] | membrane protein, putative
ORF02108 | SAG1893 | [illegible] |
ORF02109 | SAG1894 | [illegible] |
ORF02110 | SAG1895 | 204 | polypeptide deformylase
ORF02111 | SAG1896 | 333 | sugar binding transcriptional regulator RegR
ORF02112 | SAG1897 | 634 | conserved hypothetical protein
ORF02113 | SAG1898 | 271 | PTS system, IID component
ORF02114 | SAG1899 | 288 | PTS system, IIC component
ORF02115 | SAG1900 | 164 | PTS system, IIB component
ORF02116 | SAG1901 | 398 | glucuronyl hydrolase
ORF02118 | SAG1902 | 144 | PTS system, IIA component
ORF02119 | SAG1903 | 34 | hypothetical protein

ORF02120 | SAG1904 | 270 | oxidoreductase, short-chain dehydrogenase/reductase family
ORF02121 | SAG1905 | 212 | conserved hypothetical protein
ORF02122 | SAG1906 | 335 | carbohydrate kinase, PfkB family
ORF02123 | SAG1907 | 212 | 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase
ORF02124 | SAG1908 | 499 | hypothetical protein
ORF02125 | SAG1909 | 204 | nitroreductase family protein
ORF02126 | SAG1910 | 141 | transcriptional regulator, MarR family
ORF02127 | SAG1911 | 1468 | DNA polymerase III, alpha subunit, Gram-positive type
ORF02128 | SAG1912 | 194 | N-acetylmuramoyl-L-alanine amidase, family 4 protein
ORF02129 | SAG1913 | 617 | prolyl-tRNA synthetase
ORF02130 | SAG1914 | 419 | membrane-associated zinc metalloprotease, putative
ORF02131 | SAG1915 | 264 | phosphatidate cytidylyltransferase
ORF02132 | SAG1916 | 250 | undecaprenyl diphosphate synthase
ORF02133 | SAG1917 | 113 | preprotein translocase, YajC subunit
ORF02134 | SAG1918 | 114 | conserved hypothetical protein
ORF02135 | SAG1919 | 387 | malate oxidoreductase
ORF02136 | SAG1920 | 445 | citrate carrier protein, CCS family
ORF02137 | SAG1921 | 508 | sensor histidine kinase family protein
ORF02138 | SAG1922 | 229 | response regulator
ORF02139 | SAG1923 | 331 | UDP-glucose 4-epimerase
ORF02140 | SAG1924 | 535 | glucan 1,6-alpha-glucosidase
ORF02141 | SAG1925 | 377 | sugar ABC transporter, ATP-binding protein
ORF02142 | SAG1926 | 283 | helix-turn-helix domain protein, fis-type
ORF02143 | SAG1927 | 298 | lacX protein
ORF02144 | SAG1928 | 325 | tagatose 1,6-diphosphate aldolase
ORF02145 | SAG1929 | 310 | tagatose-6-phosphate kinase
ORF02146 | SAG1930 | 171 | galactose-6-phosphate isomerase, LacB subunit
ORF02147 | SAG1931 | 141 | galactose-6-phosphate isomerase, LacA subunit
ORF02148 | SAG1932 | 816 | neuraminidase
ORF02149 | SAG1933 | 482 | PTS system, IIC component, putative
ORF02150 | SAG1934 | 101 | PTS system, IIB component, putative
ORF02152 | SAG1935 | 157 | PTS system, IIA component, putative
ORF02153 | SAG1936 | 258 | lactose phosphotransferase system repressor
ORF02156 | SAG1937 | | streptococcal histidine triad family protein, degenerate, frameshift
ORF02157 | SAG1938 | 307 | adhesion lipoprotein, putative
ORF02158 | SAG1939 | 147 | conserved hypothetical protein TIGR00256
ORF02159 | SAG1940 | 738 | GTP pyrophosphokinase
ORF02160 | SAG1941 | 800 | 2',3'-cyclic-nucleotide 2'-phosphodiesterase
ORF02161 | SAG1942 | 151 | nrdI protein, putative
ORF02162 | SAG1943 | 345 | conserved hypothetical protein
ORF02163 | SAG1944 | 165 | conserved hypothetical protein
ORF02164 | SAG1945 | 345 | iron ABC transporter, iron-binding protein
ORF02165 | SAG1946 | 257 | DNA-binding response regulator
ORF02166 | SAG1947 | 549 | conserved hypothetical protein
ORF02167 | SAG1948 | 275 | PTS system, IID component
ORF02168 | SAG1949 | 269 | PTS system, IIC component
ORF02169 | SAG1950 | 163 | PTS system, IIB component
ORF02170 | SAG1951 | 141 | PTS system, IIA component, putative
ORF02171 | SAG1952 | 353 | membrane protein, putative
ORF02172 | SAG1953 | 60 | hypothetical protein



ORF02173 | SAG1954 | 384 | hypothetical protein
ORF02174 | SAG1955 | 282 | ABC transporter, ATP-binding protein
ORF02175 | SAG1956 | 96 | conserved domain protein
ORF02176 | SAG1957 | 250 | response regulator
ORF02177 | SAG1958 | 276 | conserved hypothetical protein
ORF02178 | SAG1959 | 727 | PTS system, IIABC components
ORF02179 | SAG1960 | 551 | sensor histidine kinase
ORF02180 | SAG1961 | 225 | phosphate regulon response regulator PhoB
ORF02181 | SAG1962 | 218 | phosphate transport system regulatory protein PhoU, putative
ORF02182 | SAG1963 | 253 | phosphate ABC transporter, ATP-binding protein
ORF02183 | SAG1964 | 292 | phosphate ABC transporter, permease protein
ORF02184 | SAG1965 | 281 | phosphate ABC transporter, permease protein
ORF02186 | SAG1966 | 293 | hemolysin precursor, putative
ORF02187 | SAG1967 | 195 | hypothetical protein
ORF02188 | SAG1968 | 246 | conserved hypothetical protein TIGR00046
ORF02189 | SAG1969 | 317 | ribosomal protein L11 methyltransferase
ORF02190 | SAG1970 | 102 | conserved hypothetical protein
ORF02191 | SAG1971 | 41 | hypothetical protein
ORF02192 | SAG1972 | 238 | transcriptional regulator, MerR family
ORF02194 | SAG1973 | 156 | acetyltransferase, GNAT family
ORF02195 | SAG1974 | 152 | MutT/nudix family protein
ORF02196 | SAG1975 | 47 | hypothetical protein
ORF02197 | SAG1976 | 156 | conserved hypothetical protein
ORF02198 | SAG1977 | 163 | acetyltransferase, GNAT family
ORF02199 | SAG1978 | 422 | ATPase, AAA family
ORF02201 | SAG1979 | 253 | hypothetical protein
ORF02202 | SAG1980 | 300 | ABC transporter, ATP-binding protein
ORF02203 | SAG1981 | 68 | hypothetical protein
ORF02205 | SAG1982 | 359 | transcriptional regulator, Cro/CI family
ORF02206 | SAG1983 | 105 | conserved hypothetical protein
ORF02207 | SAG1984 | 188 | conserved hypothetical protein TIGR00730
ORF02208 | SAG1985 | 51 | hypothetical protein
ORF02209 | SAG1986 | 375 | integrase, phage family, putative
ORF02210 | SAG1987 | [illegible] | conserved hypothetical protein
ORF02211 | SAG1988 | 342 | conserved hypothetical protein
ORF02212 | SAG1989 | 139 | hypothetical protein
ORF02213 | SAG1990 | 127 | hypothetical protein
ORF02214 | SAG1991 | 204 | transcriptional regulator, Cro/CI family
ORF02215 | SAG1992 | 513 | conserved hypothetical protein
ORF02216 | SAG1993 | 373 | site-specific recombinase, phage integrase family
ORF02217 | SAG1994 | 108 | conserved hypothetical protein
ORF02219 | SAG1995 | 210 | hypothetical protein
ORF02221 | SAG1996 | 263 | cell wall anchor protein-related protein
ORF02223 | SAG1997 | 182 | hypothetical protein
ORF02224 | SAG1998 | 457 | hypothetical protein
ORF02225 | SAG1999 | 47 | hypothetical protein
ORF02226 | SAG2000 | 666 | membrane protein, putative
ORF02227 | SAG2001 | 756 | conjugal transfer protein, interruption-C
ORF02228 | SAG2002 | 129 | IS1381, transposase OrfB
ORF02229 | SAG2003 | 127 | IS1381, transposase OrfA
ORF02230 | SAG2005 | 136 | conserved hypothetical protein
ORF02231 | SAG2006 | 88 | conserved hypothetical protein
ORF02232 | SAG2007 | 317 | conserved hypothetical protein



ORF02233 | SAG2008 | 84 | conserved hypothetical protein
ORF02234 | SAG2009 | 88 | conserved hypothetical protein
ORF02235 | SAG2010 | 157 | hypothetical protein
ORF02236 | SAG2011 | 160 | conserved hypothetical protein
ORF02237 | SAG2012 | 90 | hypothetical protein
ORF02238 | SAG2013 | 189 | hypothetical protein
ORF02239 | SAG2014 | 449 | hypothetical protein
ORF02240 | SAG2015 | 99 | transcriptional regulator, Cro/CI family
ORF02241 | SAG2016 | 125 | hypothetical protein
ORF02242 | SAG2017 | 429 | transcriptional regulator, Cro/CI family
ORF02243 | SAG2018 | 553 | FtsK/SpoIIIE family protein
ORF02244 | SAG2019 | 153 | hypothetical protein
ORF02245 | SAG2020 | 98 | hypothetical protein
ORF02246 | SAG2021 | 826 | cell wall surface anchor family protein
ORF02247 | SAG2022 | 417 | transposase, ISL3 family
ORF02249 | SAG2023 | 546 | mercuric reductase
ORF02250 | SAG2024 | 130 | mercuric resistance operon regulatory protein MerR
ORF02251 | SAG2025 | 522 | Mn2+/Fe2+ transporter, NRAMP family
ORF02252 | SAG2026 | 240 | membrane protein, putative
ORF02253 | SAG2027 | 205 | ABC transporter, ATP-binding protein
ORF02254 | SAG2028 | 36 | conserved hypothetical protein
ORF02255 | SAG2029 | 284 | streptomycin resistance protein
ORF02257 | SAG2030 | 130 | hypothetical protein
ORF02258 | SAG2031 | 202 | hypothetical protein
ORF02259 | SAG2032 | [illegible] | conserved hypothetical protein
ORF02260 | SAG2033 | 162 | acetyltransferase, GNAT family
ORF02261 | SAG2034 | 247 | membrane protein, putative
ORF02262 | SAG2035 | 300 | ABC transporter, ATP-binding protein
ORF02263 | SAG2036 | 68 | hypothetical protein
ORF02264 | SAG2037 | 358 | transcriptional regulator, Cro/CI family
ORF02265 | SAG2038 | 204 | PAP2 family protein
ORF02266 | SAG2039 | 98 | conserved hypothetical protein
ORF02267 | SAG2040 | 186 | conserved hypothetical protein TIGR00730
ORF02268 | SAG2041 | 287 | protease, putative
ORF02269 | SAG2042 | 100 | rhodanese family protein
ORF02270 | SAG2043 | 255 | cAMP factor
ORF02271 | SAG2044 | 62 | hypothetical protein
ORF02272 | SAG2045 | 179 | DNA topology modulation protein FlaR, putative
ORF02273 | SAG2046 | 361 | glycerol dehydrogenase, putative
ORF02274 | SAG2047 | 235 | conserved hypothetical protein
ORF02275 | SAG2048 | 614 | 5-methyltetrahydrofolate-homocysteine methyltransferase, putative
ORF02276 | SAG2049 | 745 | 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferase
ORF02277 | SAG2050 | 107 | conserved hypothetical protein
ORF02278 | SAG2051 | 230 | branched-chain amino acid transport protein AzlC, putative
ORF02279 | SAG2052 | 41 | hypothetical protein
ORF02280 | SAG2053 | 1570 | serine protease, subtilase family, putative
ORF02281 | SAG2054 | 228 | DNA-binding response regulator
ORF02282 | SAG2055 | 462 | sensor histidine kinase
ORF02283 | SAG2056 | 202 | chromosome assembly-related protein
ORF02285 | SAG2057 | 833 | leucyl-tRNA synthetase
ORF02286 | SAG2058 | 415 | major facilitator family protein


ORF02287 | SAG2059 | 281 | conserved hypothetical protein
ORF02288 | SAG2060 | 398 | glycosyl transferase, family 8
ORF02289 | SAG2061 | 401 | glycosyl transferase, family 8
ORF02290 | SAG2062 | 179 | transcription antitermination protein NusG
ORF02291 | SAG2063 | 630 | pathogenicity protein, putative
ORF02292 | SAG2064 | 57 | preprotein translocase, SecE subunit, putative
ORF02293 | SAG2066 | 773 | penicillin-binding protein 2A
ORF02294 | SAG2067 | 294 | ribosomal large subunit pseudouridine synthase, RluD subfamily
ORF02295 | SAG2068 | 546 | Lyme disease proteins of unknown function, putative
ORF02296 | SAG2069 | 403 | phosphopentomutase
ORF02297 | SAG2070 | 223 | deoxyribose-phosphate aldolase
ORF02298 | SAG2071 | 400 | Na+ dependent nucleoside transporter
ORF02300 | SAG2072 | 259 | uridine phosphorylase
ORF02301 | SAG2073 | 245 | transcriptional regulator, GntR family
ORF02302 | SAG2074 | 540 | 60 kDa chaperonin
ORF02303 | SAG2075 | 94 | chaperonin, 10 kDa
ORF02305 | SAG2076 | 267 | ABC transporter, ATP-binding protein
ORF02306 | SAG2077 | 298 | ABC transporter, permease protein
ORF02307 | SAG2078 | 320 | lipoprotein, putative
ORF02308 | SAG2079 | 265 | hydrolase, haloacid dehalogenase-like family
ORF02309 | SAG2080 | 286 | glyoxalase family protein
ORF02310 | SAG2081 | 243 | conserved hypothetical protein
ORF02311 | SAG2082 | 205 | anaerobic ribonucleoside-triphosphate reductase activating protein
ORF02312 | SAG2083 | 163 | acetyltransferase, GNAT family
ORF02313 | SAG2084 | 310 | virulence factor MviM, putative
ORF02314 | SAG2085 | 47 | conserved hypothetical protein
ORF02315 | SAG2086 | 723 | anaerobic ribonucleoside-triphosphate reductase
ORF02316 | SAG2087 | 495 | conserved hypothetical protein
ORF02317 | SAG2088 | 40 | hypothetical protein
ORF02318 | SAG2089 | 105 | conserved hypothetical protein
ORF02319 | SAG2090 | 136 | conserved hypothetical protein TIGR00250
ORF02320 | SAG2091 | 88 | conserved hypothetical protein
ORF02321 | SAG2092 | 132 | conserved hypothetical protein
ORF02322 | SAG2093 | 379 | recA protein
ORF02323 | SAG2094 | | competence/damage-inducible protein CinA, FRAMESHIFT
ORF02325 | SAG2095 | 183 | DNA-3-methyladenine glycosylase I
ORF02327 | SAG2096 | 196 | Holliday junction DNA helicase RuvA
ORF02328 | SAG2097 | 418 | transporter, putative
ORF02329 | SAG2098 | 659 | DNA mismatch repair protein HexB
ORF02330 | SAG2099 | 33 | hypothetical protein
ORF02331 | SAG2100 | 67 | cold shock protein, [illegible] family
ORF02332 | SAG2101 | 858 | DNA mismatch repair protein HexA
ORF02333 | SAG2102 | 145 | arginine repressor ArgR, putative
ORF02334 | SAG2103 | 563 | arginyl-tRNA synthetase
ORF02335 | SAG2104 | 102 | conserved hypothetical protein
ORF02337 | SAG2105 | 290 | conserved hypothetical protein
ORF02338 | SAG2106 | 314 | conserved hypothetical protein
ORF02339 | SAG2107 | 583 | aspartyl-tRNA synthetase
ORF02340 | SAG2108 | 426 | histidyl-tRNA synthetase
ORF02341 | SAG2109 | 60 | ribosomal protein L32


ORF02342 | SAG2110 | 49 | ribosomal protein [illegible]
ORF02343 | SAG2111 | 173 | conserved hypothetical protein
ORF02344 | SAG2112 | 494 | site-specific recombinase, phage integrase family
ORF02345 | SAG2113 | 82 | conserved hypothetical protein
ORF02346 | SAG2114 | 342 | conserved hypothetical protein
ORF02347 | SAG2115 | 143 | hypothetical protein
ORF02349 | SAG2116 | 151 | conserved hypothetical protein
ORF02350 | SAG2117 | 71 | hypothetical protein
ORF02351 | SAG2118 | 306 | transcriptional regulator, Cro/CI family
ORF02352 | SAG2119 | 373 | conserved domain protein
ORF02355 | SAG2120 | 56 | hypothetical protein
ORF02356 | SAG2121 | 176 | hypothetical protein
ORF02357 | SAG2122 | 223 | DNA-binding response regulator
ORF02358 | SAG2123 | 454 | sensor histidine kinase
ORF02359 | SAG2124 | 517 | membrane protein, putative
ORF02360 | SAG2125 | 308 | carbamate kinase
ORF02361 | SAG2126 | 332 | ornithine carbamoyltransferase
ORF02362 | SAG2127 | 431 | sensor histidine kinase
ORF02363 | SAG2128 | 277 | response regulator
ORF02364 | SAG2129 | 240 | amino acid ABC transporter, ATP-binding protein
ORF02365 | SAG2130 | | [illegible] binding protein
ORF02367 | SAG2131 | 847 | membrane protein, putative
ORF02368 | SAG2132 | 247 | conserved hypothetical protein
ORF02369 | SAG2133 | 118 | conserved hypothetical protein
ORF02370 | SAG2134 | 772 | membrane protein, putative
ORF02371 | SAG2135 | 179 | transcriptional regulator, TetR family, putative
ORF02372 | SAG2136 | 98 | conserved hypothetical protein
ORF02373 | SAG2137 | 203 | ribosomal protein S4
ORF02374 | SAG2138 | 95 | conserved hypothetical protein
ORF02375 | SAG2139 | 451 | replicative DNA helicase
ORF02376 | SAG2140 | 150 | ribosomal protein L9
ORF02377 | SAG2141 | 660 | DHH family protein
ORF02378 | SAG2142 | 613 | glucose inhibited division protein A
ORF02379 | SAG2143 | 203 | conserved hypothetical protein TIGR00427
ORF02380 | SAG2144 | 373 | tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase
ORF02381 | SAG2145 | 222 | L-serine dehydratase, iron-sulfur-dependent, beta subunit
ORF02382 | SAG2146 | 290 | L-serine dehydratase, iron-sulfur-dependent, alpha subunit
ORF02383 | SAG2147 | 234 | conserved hypothetical protein
ORF02384 | SAG2148 | 179 | LysM domain protein
ORF02385 | SAG2149 | 264 | cobalt transport family protein
ORF02386 | SAG2150 | 280 | ABC transporter, ATP-binding protein
ORF02387 | SAG2151 | 279 | ABC transporter, ATP-binding protein, FRAMESHIFT
ORF02388 | SAG2152 | 180 | CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase
ORF02389 | SAG2153 | 427 | peptidase, M16 family
ORF02390 | SAG2154 | 414 | conserved hypothetical protein
ORF02391 | SAG2155 | 117 | conserved hypothetical protein
ORF02392 | SAG2156 | 369 | recF protein
ORF02393 | SAG2157 | 278 | transporter, putative
ORF02395 | SAG2158 | 220 | transcriptional regulator, Cro/CI family


ORF02396 | SAG2159 | 493 | inosine-5'-monophosphate dehydrogenase
ORF02397 | SAG2160 | 161 | transcriptional regulator, ArgR family
ORF02398 | SAG2161 | 226 | transcriptional regulator, Crp/Fnr family
ORF02399 | SAG2162 | 234 | conserved hypothetical protein
ORF02400 | SAG2163 | 410 | arginine deiminase
ORF02401 | SAG2164 | 136 | acetyltransferase, GNAT family
ORF02402 | SAG2165 | 337 | ornithine carbamoyltransferase
ORF02403 | SAG2166 | 475 | arginine/ornithine antiporter
ORF02404 | SAG2167 | 318 | carbamate kinase
ORF02405 | SAG2168 | 341 | tryptophanyl-tRNA synthetase
ORF02406 | SAG2169 | 230 | conserved hypothetical protein
ORF02407 | SAG2170 | 290 | conserved hypothetical protein
ORF02408 | SAG2171 | 539 | ABC transporter, ATP-binding protein
ORF02409 | SAG2172 | 859 | ABC transporter, permease protein, putative
ORF02410 | SAG2173 | 159 | conserved hypothetical protein TIGR00246
ORF02411 | SAG2174 | 409 |
ORF02412 | SAG2175 | 257 | partitioning protein, ParB family
ORF02413 | SAG0001 | 453 | chromosomal replication initiator protein DnaA
ORF02415 | SAG0002 | 378 | DNA polymerase III, beta subunit
ORF02416 | SAG0003 | 293 | diacylglycerol kinase catalytic domain protein, putative
ORF02417 | SAG0004 | 65 | conserved hypothetical protein
ORF02418 | SAG0005 | 67 | hypothetical protein
ORF02419 | SAG0006 | 371 | conserved hypothetical GTP-binding protein
ORF02420 | SAG0007 | 191 | peptidyl-tRNA hydrolase
ORF02421 | SAG0008 | 1165 | transcription-repair coupling factor
ORF02422 | SAG0009 | 31 | hypothetical protein
ORF02423 | SAG0010 | 90 | S4 domain protein
ORF02424 | SAG0011 | 123 | cell division protein DivIC, putative
ORF02425 | SAG0012 | 44 | conserved hypothetical protein
ORF02426 | SAG0013 | 428 | conserved hypothetical protein
ORF02427 | SAG0014 | 424 | MesJ/Ycf62 family protein
ORF02428 | SAG0015 | 180 | hypoxanthine-guanine phosphoribosyltransferase
ORF02429 | SAG0016 | 658 | cell division protein FtsH
ORF03000 | SAG0157 | | DNase-related protein, DEGENERATE
ORF03001 | SAG0579 | 142 | conserved hypothetical protein
ORF03002 | SAG0580 | 111 | conserved hypothetical protein, truncation
ORF03003 | SAG0652 | | Tn5252, Orf 28 protein, degenerate
ORF03004 | SAG0655 | 57 | conserved hypothetical protein
ORF03005 | SAG0662 | 101 | cylX protein
ORF03006 | SAG0917 | 83 | Tn916, hypothetical protein
ORF03007 | SAG0920 | 23 | Tn916, hypothetical protein
ORF03008 | [illegible] | 61 | Tn916, hypothetical protein
ORF03009 | SAG0924 | 28 | Tn916, tetM leader peptide
ORF03010 | SAG0936 | 39 | Tn916, hypothetical protein
ORF03011 | SAG1484 | 48 | ribosomal protein L33
ORF03012 | SAG1857 | 119 | HNH endonuclease family protein
ORF03013 | SAG1883 | 128 | conserved hypothetical protein
ORF03014 | SAG2065 | 50 | ribosomal protein L33
ORF03015 | SAG2004 | 67 | conjugal transfer protein, interruption-N
Table 33: List of GAS ORFs which are shared with GBS and Spn



gi|13621326|gb|AAK33146.1|
gi|13621327|gb|AAK33147.1|
gi|13621328|gb|AAK33148.1|
gi|13621329|gb|AAK33149.1|
gi|13621330|gb|AAK33150.1|
gi|13621331|gb|AAK33151.1|
gi|13621332|gb|AAK33152.1|
gi|13621333|gb|AAK33153.1|
gi|13621334|gb|AAK33154.1|
gi|13621335|gb|AAK33155.1|
gi|13621337|gb|AAK33156.1|
gi|13621340|gb|AAK33158.1|
gi|13621341|gb|AAK33159.1|
gi|13621343|gb|AAK33160.1|
gi|13621344|gb|AAK33161.1|
gi|13621346|gb|AAK33163.1|
gi|13621347|gb|AAK33164.1|
gi|13621348|gb|AAK33165.1|
gi|13621349|gb|AAK33166.1|
gi|13621350|gb|AAK33167.1|
gi|13621353|gb|AAK33169.1|
gi|13621354|gb|AAK33170.1|
gi|13621355|gb|AAK33171.1|
gi|13621357|gb|AAK33173.1|
gi|13621358|gb|AAK33174.1|
gi|13621359|gb|AAK33175.1|
gi|13621361|gb|AAK33176.1|
gi|13621362|gb|AAK33177.1|
gi|13621363|gb|AAK33178.1|
gi|13621364|gb|AAK33179.1|
gi|13621365|gb|AAK33180.1|
gi|13621366|gb|AAK33181.1|
gi|13621367|gb|AAK33182.1|
gi|13621368|gb|AAK33183.1|
gi|13621369|gb|AAK33184.1|
gi|13621370|gb|AAK33185.1|
gi|13621372|gb|AAK33186.1|
gi|13621373|gb|AAK33187.1|
gi|13621374|gb|AAK33188.1|
gi|13621375|gb|AAK33189.1|
gi|13621376|gb|AAK33190.1|
gi|13621377|gb|AAK33191.1|
gi|13621378|gb|AAK33192.1|
gi|13621379|gb|AAK33193.1|
gi|13621380|gb|AAK33194.1|
gi|13621382|gb|AAK33196.1|
gi|13621383|gb|AAK33197.1|
gi|13621384|gb|AAK33198.1|
gi|13621385|gb|AAK33199.1|
gi|13621386|gb|AAK33200.1|
gi|13621387|gb|AAK33201.1|
gi|13621388|gb|AAK33202.1|
gi|13621389|gb|AAK33203.1|
gi|13621390|gb|AAK33204.1|
gi|13621391|gb|AAK33205.1|
gi|13621392|gb|AAK33206.1|



gi|13621393|gb|AAK33207.1|
gi|13621394|gb|AAK33208.1|
gi|13621397|gb|AAK33210.1|
gi|13621398|gb|AAK33211.1|
gi|13621399|gb|AAK33212.1|
gi|13621401|gb|AAK33214.1|
gi|13621403|gb|AAK33215.1|
gi|13621404|gb|AAK33216.1|
gi|13621405|gb|AAK33217.1|
gi|13621407|gb|AAK33218.1|
gi|13621408|gb|AAK33219.1|
gi|13621409|gb|AAK33220.1|
gi|13621413|gb|AAK33224.1|
gi|13621415|gb|AAK33226.1|
gi|13621416|gb|AAK33227.1|
gi|13621418|gb|AAK33229.1|
gi|13621419|gb|AAK33230.1|
gi|13621424|gb|AAK33234.1|
gi|13621425|gb|AAK33235.1|
gi|13621426|gb|AAK33236.1|
gi|13621434|gb|AAK33243.1|
gi|13621450|gb|AAK33258.1|
gi|13621455|gb|AAK33262.1|
gi|13621456|gb|AAK33263.1|
gi|13621457|gb|AAK33264.1|
gi|13621467|gb|AAK33273.1|
gi|13621468|gb|AAK33274.1|
gi|13621469|gb|AAK33275.1|
gi|13621470|gb|AAK33276.1|
gi|13621471|gb|AAK33277.1|
gi|13621472|gb|AAK33278.1|
gi|13621473|gb|AAK33279.1|
gi|13621476|gb|AAK33281.1|
gi|13621477|gb|AAK33282.1|
gi|13621478|gb|AAK33283.1|
gi|13621480|gb|AAK33285.1|
gi|13621481|gb|AAK33266.1|
gi|13621491|gb|AAK33295.1|
gi|13621494|gb|AAK33298.1|
gi|13621496|gb|AAK33299.1|
gi|13621501|gb|AAK33304.1|
gi|13621502|gb|AAK33305.1|
gi|13621505|gb|AAK33307.1|
gi|13621506|gb|AAK33308.1|
gi|13621507|gb|AAK33309.1|
gi|13621510|gb|AAK33312.1|
gi|13621511|gb|AAK33313.1|
gi|13621513|gb|AAK33315.1|
gi|13621516|gb|AAK33317.1|
gi|13621518|gb|AAK33319.1|
gi|13621521|gb|AAK33322.1|
gi|13621522|gb|AAK33323.1|
gi|13621523|gb|AAK33324.1|
gi|13621524|gb|AAK33325.1|
gi|13621525|gb|AAK33326.1|
gi|13621527|gb|AAK33327.1|
gi|13621528|gb|AAK33328.1|
gi|13621529|gb|AAK33329.1|
gi|13621530|gb|AAK33330.1|
gi|13621531|gb|AAK33331.1|
gi|13621532|gb|AAK33332.1|
gi|13621533|gb|AAK33333.1|
gi|13621534|gb|AAK33334.1|
gi|13621535|gb|AAK33335.1|
gi|13621536|gb|AAK33336.1|
gi|13621537|gb|AAK33337.1|
gi|13621539|gb|AAK33338.1|
gi|13621540|gb|AAK33339.1|
gi|13621541|gb|AAK33340.1|
gi|13621542|gb|AAK33341.1|
gi|13621543|gb|AAK33342.1|
gi|13621544|gb|AAK33343.1|
gi|13621546|gb|AAK33345.1|
gi|13621547|gb|AAK33346.1|
gi|13621548|gb|AAK33347.1|
gi|13621550|gb|AAK33348.1|
gi|13621551|gb|AAK33349.1|
gi|13621552|gb|AAK33350.1|
gi|13621553|gb|AAK33351.1|
gi|13621554|gb|AAK33352.1|
gi|13621555|gb|AAK33353.1|
gi|13621557|gb|AAK33355.1|
gi|13621559|gb|AAK33356.1|
gi|13621560|gb|AAK33357.1|
gi|13621561|gb|AAK33358.1|
gi|13621562|gb|AAK33359.1|
gi|13621563|gb|AAK33360.1|
gi|13621564|gb|AAK33361.1|
gi|13621565|gb|AAK33362.1|
gi|13621566|gb|AAK33363.1|
gi|13621567|gb|AAK33364.1|
gi|13621569|gb|AAK33365.1|
gi|13621571|gb|AAK33367.1|
gi|13621572|gb|AAK33368.1|
gi|13621573|gb|AAK33369.1|
gi|13621574|gb|AAK33370.1|
gi|13621575|gb|AAK33371.1|
gi|13621576|gb|AAK33372.1|
gi|13621577|gb|AAK33373.1|
gi|13621579|gb|AAK33374.1|
gi|13621581|gb|AAK33376.1|
gi|13621582|gb|AAK33377.1|
gi|13621583|gb|AAK33378.1|
gi|13621584|gb|AAK33379.1|
gi|13621585|gb|AAK33380.1|
gi|13621586|gb|AAK33381.1|
gi|13621588|gb|AAK33383.1|
gi|13621589|gb|AAK33384.1|
gi|13621590|gb|AAK33385.1|
gi|13621592|gb|AAK33386.1|
gi|13621593|gb|AAK33387.1|
gi|13621594|gb|AAK33388.1|



gi|13621595|gb|AAK33389.1|
gi|13621596|gb|AAK33390.1|
gi|13621597|gb|AAK33391.1|
gi|13621598|gb|AAK33392.1|
gi|13621599|gb|AAK33393.1|
gi|13621600|gb|AAK33394.1|
gi|13621602|gb|AAK33395.1|
gi|13621603|gb|AAK33396.1|
gi|13621604|gb|AAK33397.1|
gi|13621605|gb|AAK33398.1|
gi|13621606|gb|AAK33399.1|
gi|13621607|gb|AAK33400.1|
gi|13621608|gb|AAK33401.1|
gi|13621609|gb|AAK33402.1|
gi|13621611|gb|AAK33404.1|
gi|13621614|gb|AAK33406.1|
gi|13621615|gb|AAK33407.1|
gi|13621616|gb|AAK33408.1|
gi|13621617|gb|AAK33409.1|
gi|13621618|gb|AAK33410.1|
gi|13621619|gb|AAK33411.1|
gi|13621620|gb|AAK33412.1|
gi|13621621|gb|AAK33413.1|
gi|13621622|gb|AAK33414.1|
gi|13621623|gb|AAK33415.1|
gi|13621624|gb|AAK33416.1|
gi|13621625|gb|AAK33417.1|
gi|13621627|gb|AAK33419.1|
gi|13621629|gb|AAK33420.1|
gi|13621630|gb|AAK33421.1|
gi|13621631|gb|AAK33422.1|
gi|13621633|gb|AAK33424.1|
gi|13621634|gb|AAK33425.1|
gi|13621636|gb|AAK33427.1|
gi|13621637|gb|AAK33428.1|
gi|13621638|gb|AAK33429.1|
gi|13621640|gb|AAK33430.1|
gi|13621642|gb|AAK33432.1|
gi|13621644|gb|AAK33434.1|
gi|13621645|gb|AAK33435.1|
gi|13621647|gb|AAK33437.1|
gi|13621648|gb|AAK33438.1|
gi|13621650|gb|AAK33440.1|
gi|13621651|gb|AAK33441.1|
gi|13621652|gb|AAK33442.1|
gi|13621657|gb|AAK33446.1|
gi|13621658|gb|AAK33447.1|
gi|13621660|gb|AAK33449.1|
gi|13621670|gb|AAK33458.1|
gi|13621671|gb|AAK33459.1|
gi|13621672|gb|AAK33460.1|
gi|13621675|gb|AAK33462.1|
gi|13621676|gb|AAK33463.1|
gi|13621678|gb|AAK33465.1|
gi|13621680|gb|AAK33467.1|
gi|13621681|gb|AAK33468.1|
gi|13621682|gb|AAK33469.1|
gi|13621683|gb|AAK33470.1|
gi|13621684|gb|AAK33471.1|
gi|13621685|gb|AAK33472.1|
gi|13621688|gb|AAK33474.1|
gi|13621689|gb|AAK33475.1|
gi|13621690|gb|AAK33476.1|
gi|13621691|gb|AAK33477.1|
gi|13621692|gb|AAK33478.1|
gi|13621693|gb|AAK33479.1|
gi|13621694|gb|AAK33480.1|
gi|13621695|gb|AAK33481.1|
gi|13621697|gb|AAK33483.1|
gi|13621698|gb|AAK33484.1|
gi|13621700|gb|AAK33485.1|
gi|13621701|gb|AAK33486.1|
gi|13621702|gb|AAK33487.1|
gi|13621714|gb|AAK33498.1|
gi|13621715|gb|AAK33499.1|
gi|13621717|gb|AAK33501.1|
gi|13621718|gb|AAK33502.1|
gi|13621719|gb|AAK33503.1|
gi|13621720|gb|AAK33504.1|
gi|13621726|gb|AAK33509.1|
gi|13621727|gb|AAK33510.1|
gi|13621729|gb|AAK33512.1|
gi|13621730|gb|AAK33513.1|
gi|13621731|gb|AAK33514.1|
gi|13621732|gb|AAK33515.1|
gi|13621733|gb|AAK33516.1|
gi|13621734|gb|AAK33517.1|
gi|13621735|gb|AAK33518.1|
gi|13621736|gb|AAK33519.1|
gi|13621741|gb|AAK33523.1|
gi|13621742|gb|AAK33524.1|
gi|13621743|gb|AAK33525.1|
gi|13621744|gb|AAK33526.1|
gi|13621745|gb|AAK33527.1|
gi|13621747|gb|AAK33528.1|
gi|13621756|gb|AAK33537.1|
gi|13621773|gb|AAK33552.1|
gi|13621774|gb|AAK33553.1|
gi|13621775|gb|AAK33554.1|
gi|13621777|gb|AAK33556.1|
gi|13621778|gb|AAK33557.1|
gi|13621779|gb|AAK33558.1|
gi|13621781|gb|AAK33559.1|
gi|13621782|gb|AAK33560.1|
gi|13621785|gb|AAK33563.1|
gi|13621786|gb|AAK33564.1|
gi|13621787|gb|AAK33565.1|
gi|13621788|gb|AAK33566.1|
gi|13621789|gb|AAK33567.1|
gi|13621790|gb|AAK33568.1|
gi|13621793|gb|AAK33571.1|
gi|13621794|gb|AAK33572.1|



gi|13621796|gb|AAK33573.1|
gi|13621797|gb|AAK33574.1|
gi|13621799|gb|AAK33576.1|
gi|13621800|gb|AAK33577.1|
gi|13621802|gb|AAK33579.1|
gi|13621806|gb|AAK33583.1|
gi|13621808|gb|AAK33584.1|
gi|13621809|gb|AAK33585.1|
gi|13621810|gb|AAK33586.1|
gi|13621811|gb|AAK33587.1|
gi|13621812|gb|AAK33588.1|
gi|13621813|gb|AAK33589.1|
gi|13621814|gb|AAK33590.1|
gi|13621817|gb|AAK33592.1|
gi|13621818|gb|AAK33593.1|
gi|13621819|gb|AAK33594.1|
gi|13621820|gb|AAK33595.1|
gi|13621821|gb|AAK33596.1|
gi|13621822|gb|AAK33597.1|
gi|13621823|gb|AAK33598.1|
gi|13621824|gb|AAK33699.1|
gi|13621825|gb|AAK33600.1|
gi|13621826|gb|AAK33601.1|
gi|13621828|gb|AAK33602.1|
gi|13621829|gb|AAK33603.1|
gi|13621830|gb|AAK33604.1|
gi|13621831|gb|AAK33605.1|
gi|13621834|gb|AAK33608.1|
gi|13621835|gb|AAK33609.1|
gi|13621836|gb|AAK33610.1|
gi|13621837|gb|AAK33611.1|
gi|13621839|gb|AAK33612.1|
gi|13621840|gb|AAK33613.1|
gi|13621841|gb|AAK33614.1|
gi|13621842|gb|AAK33615.1|
gi|13621843|gb|AAK33616.1|
gi|13621844|gb|AAK33617.1|
gi|13621898|gb|AAK33667.1|
gi|13621901|gb|AAK33670.1|
gi|13621902|gb|AAK33671.1|
gi|13621904|gb|AAK33672.1|
gi|13621907|gb|AAK33675.1|
gi|13621908|gb|AAK33676.1|
gi|13621909|gb|AAK33677.1|
gi|13621910|gb|AAK33678.1|
gi|13621912|gb|AAK33680.1|
gi|13621924|gb|AAK33690.1|
gi|13621929|gb|AAK33694.1|
gi|13621930|gb|AAK33695.1|
gi|13621931|gb|AAK33696.1|
gi|13621933|gb|AAK33698.1|
gi|13621934|gb|AAK33699.1|
gi|13621935|gb|AAK33700.1|
gi|13621936|gb|AAK33701.1|
gi|13621937|gb|AAK33702.1|
gi|13621938|gb|AAK33703.1|
gi|13621939|gb|AAK33704.1|
gi|13621942|gb|AAK33706.1|
gi|13621944|gb|AAK33708.1|
gi|13621945|gb|AAK33709.1|
gi|13621946|gb|AAK33710.1|
gi|13621950|gb|AAK33714.1|
gi|13621953|gb|AAK33716.1|
gi|13621954|gb|AAK33717.1|
gi|13621955|gb|AAK33718.1|
gi|13621956|gb|AAK33719.1|
gi|13621957|gb|AAK33720.1|
gi|13621958|gb|AAK33721.1|
gi|13621959|gb|AAK33722.1|
gi|13621961|gb|AAK33723.1|
gi|13621975|gb|AAK33736.1|
gi|13621977|gb|AAK33738.1|
gi|13621978|gb|AAK33739.1|
gi|13621979|gb|AAK33740.1|
gi|13621980|gb|AAK33741.1|
gi|13621981|gb|AAK33742.1|
gi|13621982|gb|AAK33743.1|
gi|13621985|gb|AAK33745.1|
gi|13621986|gb|AAK33746.1|
gi|13621987|gb|AAK33747.1|
gi|13621989|gb|AAK33749.1|
gi|13621990|gb|AAK33750.1|
gi|13621992|gb|AAK33752.1|
gi|13621993|gb|AAK33753.1|
gi|13621994|gb|AAK33754.1|
gi|13621996|gb|AAK33755.1|
gi|13621997|gb|AAK33756.1|
gi|13621998|gb|AAK33757.1|
gi|13621999|gb|AAK33758.1|
gi|13622000|gb|AAK33759.1|
gi|13622001|gb|AAK33760.1|
gi|13622002|gb|AAK33761.1|
gi|13622003|gb|AAK33762.1|
gi|13622004|gb|AAK33763.1|
gi|13622005|gb|AAK33764.1|
gi|13622006|gb|AAK33765.1|
gi|13622008|gb|AAK33766.1|
gi|13622009|gb|AAK33767.1|
gi|13622010|gb|AAK33768.1|
gi|13622012|gb|AAK33770.1|
gi|13622013|gb|AAK33771.1|
gi|13622017|gb|AAK33774.1|
gi|13622018|gb|AAK33775.1|
gi|13622019|gb|AAK33776.1|
gi|13622020|gb|AAK33777.1|
gi|13622021|gb|AAK33778.1|
gi|13622024|gb|AAK33781.1|
gi|13622025|gb|AAK33782.1|
gi|13622026|gb|AAK33783.1|
gi|13622031|gb|AAK33787.1|
gi|13622032|gb|AAK33788.1|
gi|13622033|gb|AAK33789.1|



gi|13622034|gb|AAK33790.1|
gi|13622035|gb|AAK33791.1|
gi|13622039|gb|AAK33794.1|
gi|13622041|gb|AAK33796.1|
gi|13622042|gb|AAK33797.1|
gi|13622043|gb|AAK33798.1|
gi|13622044|gb|AAK33799.1|
gi|13622045|gb|AAK33800.1|
gi|13622046|gb|AAK33801.1|
gi|13622048|gb|AAK33802.1|
gi|13622049|gb|AAK33803.1|
gi|13622050|gb|AAK33804.1|
gi|13622051|gb|AAK33805.1|
gi|13622052|gb|AAK33806.1|
gi|13622054|gb|AAK33808.1|
gi|13622055|gb|AAK33809.1|
gi|13622056|gb|AAK33810.1|
gi|13622058|gb|AAK33812.1|
gi|13622060|gb|AAK33813.1|
gi|13622062|gb|AAK33815.1|
gi|13622064|gb|AAK33817.1|
gi|13622065|gb|AAK33818.1|
gi|13622068|gb|AAK33821.1|
gi|13622069|gb|AAK33822.1|
gi|13622070|gb|AAK33823.1|
gi|13622071|gb|AAK33824.1|
gi|13622073|gb|AAK33825.1|
gi|13622074|gb|AAK33826.1|
gi|13622075|gb|AAK33827.1|
gi|13622077|gb|AAK33829.1|
gi|13622079|gb|AAK33831.1|
gi|13622083|gb|AAK33834.1|
gi|13622085|gb|AAK33836.1|
gi|13622086|gb|AAK33837.1|
gi|13622087|gb|AAK33838.1|
gi|13622088|gb|AAK33839.1|
gi|13622089|gb|AAK33840.1|
gi|13622090|gb|AAK33841.1|
gi|13622091|gb|AAK33842.1|
gi|13622092|gb|AAK33843.1|
gi|13622093|gb|AAK33844.1|
gi|13622095|gb|AAK33845.1|
gi|13622096|gb|AAK33846.1|
gi|13622097|gb|AAK33847.1|
gi|13622162|gb|AAK33908.1|
gi|13622163|gb|AAK33909.1|
gi|13622164|gb|AAK33910.1|
gi|13622165|gb|AAK33911.1|
gi|13622166|gb|AAK33912.1|
gi|13622169|gb|AAK33914.1|
gi|13622170|gb|AAK33915.1|
gi|13622171|gb|AAK33916.1|
gi|13622172|gb|AAK33917.1|
gi|13622174|gb|AAK33919.1|
gi|13622175|gb|AAK33920.1|
gi|13622176|gb|AAK33921.1|
gi|13622177|gb|AAK33922.1|
gi|13622179|gb|AAK33923.1|
gi|13622180|gb|AAK33924.1|
gi|13622161|gb|AAK33925.1|
gi|13622182|gb|AAK33926.1|
gi|13622183|gb|AAK33927.1|
gi|13622184|gb|AAK33928.1|
gi|13622185|gb|AAK33929.1|
gi|13622186|gb|AAK33930.1|
gi|13622189|gb|AAK33932.1|
gi|13622190|gb|AAK33933.1|
gi|13622191|gb|AAK33934.1|
gi|13622192|gb|AAK33935.1|
gi|13622198|gb|AAK33940.1|
gi|13622200|gb|AAK33942.1|
gi|13622201|gb|AAK33943.1|
gi|13622204|gb|AAK33946.1|
gi|13622205|gb|AAK33947.1|
gi|13622207|gb|AAK33949.1|
gi|13622208|gb|AAK33950.1|
gi|13622211|gb|AAK33952.1|
gi|13622213|gb|AAK33954.1|
gi|13622214|gb|AAK33955.1|
gi|13622215|gb|AAK33956.1|
gi|13622216|gb|AAK33957.1|
gi|13622217|gb|AAK33958.1|
gi|13622218|gb|AAK33959.1|
gi|13622219|gb|AAK33960.1|
gi|13622222|gb|AAK33962.1|
gi|13622223|gb|AAK33963.1|
gi|13622224|gb|AAK33964.1|
gi|13622233|gb|AAK33972.1|
gi|13622235|gb|AAK33974.1|
gi|13622236|gb|AAK33975.1|
gi|13622237|gb|AAK33976.1|
gi|13622239|gb|AAK33978.1|
gi|13622240|gb|AAK33979.1|
gi|13622241|gb|AAK33980.1|
gi|13622242|gb|AAK33981.1|
gi|13622243|gb|AAK33982.1|
gi|13622244|gb|AAK33983.1|
gi|13622250|gb|AAK33988.1|
gi|13622252|gb|AAK33990.1|
gi|13622253|gb|AAK33991.1|
gi|13622255|gb|AAK33993.1|
gi|13622256|gb|AAK33994.1|
gi|13622257|gb|AAK33995.1|
gi|13622259|gb|AAK33996.1|
gi|13622260|gb|AAK33997.1|
gi|13622261|gb|AAK33998.1|
gi|13622262|gb|AAK33999.1|
gi|13622263|gb|AAK34000.1|
gi|13622264|gb|AAK34001.1|
gi|13622265|gb|AAK34002.1|
gi|13622266|gb|AAK34003.1|
gi|13622268|gb|AAK34005.1|



gi|13622269|gb|AAK34006.1|
gi|13622271|gb|AAK34007.1|
gi|13622272|gb|AAK34008.1|
gi|13622273|gb|AAK34009.1|
gi|13622274|gb|AAK34010.1|
gi|13622276|gb|AAK34011.1|
gi|13622276|gb|AAK34012.1|
gi|13622277|gb|AAK34013.1|
gi|13622278|gb|AAK34014.1|
gi|13622279|gb|AAK34015.1|
gi|13622281|gb|AAK34017.1|
gi|13622282|gb|AAK34018.1|
gi|13622283|gb|AAK34019.1|
gi|13622284|gb|AAK34020.1|
gi|13622285|gb|AAK34021.1|
gi|13622287|gb|AAK34022.1|
gi|13622288|gb|AAK34023.1|
gi|13622289|gb|AAK34024.1|
gi|13622290|gb|AAK34025.1|
gi|13622294|gb|AAK34029.1|
gi|13622295|gb|AAK34030.1|
gi|13622296|gb|AAK34031.1|
gi|13622297|gb|AAK34032.1|
gi|13622298|gb|AAK34033.1|
gi|13622299|gb|AAK34034.1|
gi|13622301|gb|AAK34035.1|
gi|13622306|gb|AAK34040.1|
gi|13622326|gb|AAK34058.1|
gi|13622328|gb|AAK34060.1|
gi|13622329|gb|AAK34061.1|
gi|13622330|gb|AAK34062.1|
gi|13622332|gb|AAK34064.1|
gi|13622333|gb|AAK34065.1|
gi|13622335|gb|AAK34066.1|
gi|13622338|gb|AAK34069.1|
gi|13622339|gb|AAK34070.1|
gi|13622340|gb|AAK34071.1|
gi|13622341|gb|AAK34072.1|
gi|13622343|gb|AAK34073.1|
gi|13622350|gb|AAK34080.1|
gi|13622351|gb|AAK34081.1|
gi|13622352|gb|AAK34082.1|
gi|13622353|gb|AAK34083.1|
gi|13622355|gb|AAK34084.1|
gi|13622356|gb|AAK34085.1|
gi|13622357|gb|AAK34086.1|
gi|13622358|gb|AAK34087.1|
gi|13622359|gb|AAK34088.1|
gi|13622360|gb|AAK34089.1|
gi|13622361|gb|AAK34090.1|
gi|13622362|gb|AAK34091.1|
gi|13622363|gb|AAK34092.1|
gi|13622364|gb|AAK34093.1|
gi|13622366|gb|AAK34094.1|
gi|13622367|gb|AAK34095.1|
gi|13622368|gb|AAK34096.1|
gi|13622369|gb|AAK34097.1|
gi|13622370|gb|AAK34098.1|
gi|13622371|gb|AAK34099.1|
gi|13622372|gb|AAK34100.1|
gi|13622373|gb|AAK34101.1|
gi|13622374|gb|AAK34102.1|
gi|13622375|gb|AAK34103.1|
gi|13622376|gb|AAK34104.1|
gi|13622377|gb|AAK34105.1|
gi|13622378|gb|AAK34106.1|
gi|13622380|gb|AAK34107.1|
gi|13622383|gb|AAK34110.1|
gi|13622384|gb|AAK34111.1|
gi|13622387|gb|AAK34114.1|
gi|13622389|gb|AAK34116.1|
gi|13622394|gb|AAK34120.1|
gi|13622395|gb|AAK34121.1|
gi|13622396|gb|AAK34122.1|
gi|13622398|gb|AAK34124.1|
gi|13622399|gb|AAK34125.1|
gi|13622400|gb|AAK34126.1|
gi|13622401|gb|AAK34127.1|
gi|13622403|gb|AAK34128.1|
gi|13622405|gb|AAK34130.1|
gi|13622406|gb|AAK34131.1|
gi|13622407|gb|AAK34132.1|
gi|13622408|gb|AAK34133.1|
gi|13622415|gb|AAK34139.1|
gi|13622416|gb|AAK34140.1|
gi|13622417|gb|AAK34141.1|
gi|13622419|gb|AAK34143.1|
gi|13622420|gb|AAK34144.1|
gi|13622424|gb|AAK34147.1|
gi|13622425|gb|AAK34148.1|
gi|13622431|gb|AAK34153.1|
gi|13622432|gb|AAK34154.1|
gi|13622433|gb|AAK34155.1|
gi|13622434|gb|AAK34156.1|
gi|13622435|gb|AAK34157.1|
gi|13622436|gb|AAK34158.1|
gi|13622437|gb|AAK34159.1|
gi|13622444|gb|AAK34165.1|
gi|13622447|gb|AAK34168.1|
gi|13622450|gb|AAK34170.1|
gi|13622451|gb|AAK34171.1|
gi|13622455|gb|AAK34175.1|
gi|13622457|gb|AAK34177.1|
gi|13622458|gb|AAK34178.1|
gi|13622460|gb|AAK34179.1|
gi|13622461|gb|AAK34180.1|
gi|13622462|gb|AAK34181.1|
gi|13622463|gb|AAK34182.1|
gi|13622464|gb|AAK34183.1|
gi|13622465|gb|AAK34184.1|
gi|13622467|gb|AAK34186.1|
gi|13622468|gb|AAK34187.1|



gi|13622471|gb|AAK34189.1|
gi|13622473|gb|AAK34191.1|
gi|13622474|gb|AAK34192.1|
gi|13622477|gb|AAK34195.1|
gi|13622478|gb|AAK34196.1|
gi|13622479|gb|AAK34197.1|
gi|13622481|gb|AAK34198.1|
gi|13622482|gb|AAK34199.1|
gi|13622483|gb|AAK34200.1|
gi|13622484|gb|AAK34201.1|
gi|13622485|gb|AAK34202.1|
gi|13622486|gb|AAK34203.1|
gi|13622491|gb|AAK34207.1|
gi|13622492|gb|AAK34208.1|
gi|13622493|gb|AAK34209.1|
gi|13622494|gb|AAK34210.1|
gi|13622495|gb|AAK34211.1|
gi|13622496|gb|AAK34212.1|
gi|13622497|gb|AAK34213.1|
gi|13622499|gb|AAK34214.1|
gi|13622500|gb|AAK34215.1|
gi|13622501|gb|AAK34216.1|
gi|13622506|gb|AAK34221.1|
gi|13622507|gb|AAK34222.1|
gi|13622508|gb|AAK34223.1|
gi|13622509|gb|AAK34224.1|
gi|13622511|gb|AAK34225.1|
gi|13622512|gb|AAK34226.1|
gi|13622513|gb|AAK34227.1|
gi|13622515|gb|AAK34229.1|
gi|13622516|gb|AAK34230.1|
gi|13622517|gb|AAK34231.1|
gi|13622518|gb|AAK34232.1|
gi|13622520|gb|AAK34233.1|
gi|13622521|gb|AAK34234.1|
gi|13622523|gb|AAK34236.1|
gi|13622524|gb|AAK34237.1|
gi|13622525|gb|AAK34238.1|
gi|13622526|gb|AAK34239.1|
gi|13622527|gb|AAK34240.1|
gi|13622579|gb|AAK34289.1|
gi|13622583|gb|AAK34292.1|
gi|13622585|gb|AAK34294.1|
gi|13622587|gb|AAK34296.1|
gi|13622588|gb|AAK34297.1|
gi|13622590|gb|AAK34299.1|
gi|13622591|gb|AAK34300.1|
gi|13622593|gb|AAK34301.1|
gi|13622595|gb|AAK34303.1|
gi|13622596|gb|AAK34304.1|
gi|13622597|gb|AAK34305.1|
gi|13622598|gb|AAK34306.1|
gi|13622599|gb|AAK34307.1|
gi|13622600|gb|AAK34308.1|
gi|13622601|gb|AAK34309.1|
gi|13622603|gb|AAK34310.1|
gi|13622604|gb|AAK34311.1|
gi|13622606|gb|AAK34313.1|
gi|13622607|gb|AAK34314.1|
gi|13622608|gb|AAK34315.1|
gi|13622609|gb|AAK34316.1|
gi|13622610|gb|AAK34317.1|
gi|13622611|gb|AAK34318.1|
gi|13622612|gb|AAK34319.1|
gi|13622615|gb|AAK34321.1|
gi|13622616|gb|AAK34322.1|
gi|13622617|gb|AAK34323.1|
gi|13622618|gb|AAK34324.1|
gi|13622621|gb|AAK34327.1|
gi|13622622|gb|AAK34328.1|
gi|13622623|gb|AAK34329.1|
gi|13622624|gb|AAK34330.1|
gi|13622625|gb|AAK34331.1|
gi|13622626|gb|AAK34332.1|
gi|13622628|gb|AAK34333.1|
gi|13622629|gb|AAK34334.1|
gi|13622630|gb|AAK34335.1|
gi|13622631|gb|AAK34336.1|
gi|13622632|gb|AAK34337.1|
gi|13622634|gb|AAK34339.1|
gi|13622636|gb|AAK34341.1|
gi|13622640|gb|AAK34344.1|
gi|13622641|gb|AAK34345.1|
gi|13622652|gb|AAK34355.1|
gi|13622653|gb|AAK34356.1|
gi|13622654|gb|AAK34357.1|
gi|13622656|gb|AAK34359.1|
gi|13622660|gb|AAK34363.1|
gi|13622665|gb|AAK34367.1|
gi|13622668|gb|AAK34370.1|
gi|13622675|gb|AAK34376.1|
gi|13622676|gb|AAK34377.1|
gi|13622683|gb|AAK34383.1|
gi|13622684|gb|AAK34384.1|
gi|13622685|gb|AAK34385.1|
gi|13622688|gb|AAK34387.1|
gi|13622689|gb|AAK34388.1|
gi|13622690|gb|AAK34389.1|
gi|13622691|gb|AAK34390.1|
gi|13622692|gb|AAK34391.1|
gi|13622693|gb|AAK34392.1|
gi|13622694|gb|AAK34393.1|
gi|13622695|gb|AAK34394.1|
gi|13622696|gb|AAK34395.1|
gi|13622698|gb|AAK34396.1|
gi|13622699|gb|AAK34397.1|
gi|13622700|gb|AAK34398.1|
gi|13622701|gb|AAK34399.1|
gi|13622702|gb|AAK34400.1|
gi|13622703|gb|AAK34401.1|
gi|13622704|gb|AAK34402.1|
gi|13622705|gb|AAK34403.1|



gi|13622711|gb|AAK34408.1|
gi|13622713|gb|AAK34410.1|
gi|13622714|gb|AAK34411.1|
gi|13622715|gb|AAK34412.1|
gi|13622718|gb|AAK34414.1|
gi|13622719|gb|AAK34415.1|
gi|13622720|gb|AAK34416.1|
gi|13622721|gb|AAK34417.1|
gi|13622722|gb|AAK34418.1|
gi|13622723|gb|AAK34419.1|
gi|13622727|gb|AAK34422.1|
gi|13622728|gb|AAK34423.1|
gi|13622729|gb|AAK34424.1|
gi|13622730|gb|AAK34425.1|
gi|13622731|gb|AAK34426.1|
gi|13622733|gb|AAK34428.1|
gi|13622734|gb|AAK34429.1|
gi|13622735|gb|AAK34430.1|
gi|13622736|gb|AAK34431.1|
gi|13622737|gb|AAK34432.1|
gi|13622740|gb|AAK34434.1|
gi|13622741|gb|AAK34435.1|
gi|13622742|gb|AAK34436.1|
gi|13622744|gb|AAK34438.1|
gi|13622745|gb|AAK34439.1|
gi|13622746|gb|AAK34440.1|
gi|13622749|gb|AAK34442.1|
gi|13622750|gb|AAK34443.1|
gi|13622751|gb|AAK34444.1|
gi|13622752|gb|AAK34445.1|
gi|13622753|gb|AAK34446.1|
gi|13622754|gb|AAK34447.1|
gi|13622760|gb|AAK34452.1|
gi|13622762|gb|AAK34454.1|
gi|13622763|gb|AAK34455.1|
gi|13622764|gb|AAK34456.1|
gi|13622765|gb|AAK34457.1|
gi|13622766|gb|AAK34458.1|
gi|13622767|gb|AAK34459.1|
gi|13622768|gb|AAK34460.1|
gi|13622770|gb|AAK34462.1|
gi|13622771|gb|AAK34463.1|
gi|13622774|gb|AAK34465.1|
gi|13622775|gb|AAK34466.1|
gi|13622776|gb|AAK34467.1|
gi|13622777|gb|AAK34468.1|
gi|13622778|gb|AAK34469.1|
gi|13622779|gb|AAK34470.1|
gi|13622780|gb|AAK34471.1|
gi|13622781|gb|AAK34472.1|
gi|13622782|gb|AAK34473.1|
gi|13622783|gb|AAK34474.1|
gi|13622785|gb|AAK34475.1|
gi|13622787|gb|AAK34477.1|
gi|13622789|gb|AAK34479.1|
gi|13622790|gb|AAK34480.1|




gi|13622791|gb|AAK34481.1|
gi|13622792|gb|AAK34482.1|
gi|13622793|gb|AAK34483.1|
gi|13622794|gb|AAK34484.1|
gi|13622795|gb|AAK34485.1|
gi|13622796|gb|AAK34486.1|
gi|13622798|gb|AAK34487.1|
gi|13622799|gb|AAK34488.1|
gi|13622800|gb|AAK34489.1|
gi|13622801|gb|AAK34490.1|
gi|13622802|gb|AAK34491.1|
gi|13622803|gb|AAK34492.1|
gi|13622804|gb|AAK34493.1|
gi|13622805|gb|AAK34494.1|
gi|13622806|gb|AAK34495.1|
gi|13622807|gb|AAK34496.1|
gi|13622808|gb|AAK34497.1|
gi|13622809|gb|AAK34498.1|
gi|13622810|gb|AAK34499.1|
gi|13622812|gb|AAK34500.1|
gi|13622813|gb|AAK34501.1|
gi|13622814|gb|AAK34502.1|
gi|13622815|gb|AAK34503.1|
gi|13622818|gb|AAK34506.1|
gi|13622821|gb|AAK34509.1|
gi|13622822|gb|AAK34510.1|
gi|13622823|gb|AAK34511.1|
gi|13622825|gb|AAK34512.1|
gi|13622826|gb|AAK34513.1|
gi|13622827|gb|AAK34514.1|
gi|13622828|gb|AAK34515.1|
gi|13622829|gb|AAK34516.1|
gi|13622830|gb|AAK34517.1|
gi|13622833|gb|AAK34520.1|
gi|13622838|gb|AAK34524.1|
gi|13622839|gb|AAK34525.1|
gi|13622840|gb|AAK34526.1|
gi|13622841|gb|AAK34527.1|
gi|13622847|gb|AAK34532.1|
gi|13622848|gb|AAK34533.1|
gi|13622849|gb|AAK34534.1|
gi|13622853|gb|AAK34537.1|
gi|13622854|gb|AAK34538.1|
gi|13622856|gb|AAK34540.1|
gi|13622857|gb|AAK34541.1|
gi|13622858|gb|AAK34542.1|
gi|13622860|gb|AAK34543.1|
gi|13622861|gb|AAK34544.1|
gi|13622862|gb|AAK34545.1|
gi|13622863|gb|AAK34546.1|
gi|13622864|gb|AAK34547.1|
gi|13622865|gb|AAK34548.1|
gi|13622866|gb|AAK34549.1|
gi|13622867|gb|AAK34550.1|
gi|13622868|gb|AAK34551.1|
gi|13622869|gb|AAK34552.1|



gi|13622870|gb|AAK34553.1|
gi|13622873|gb|AAK34555.1|
gi|13622875|gb|AAK34557.1|
gi|13622876|gb|AAK34558.1|
gi|13622877|gb|AAK34559.1|
gi|13622878|gb|AAK34560.1|
gi|13622879|gb|AAK34561.1|
gi|13622880|gb|AAK34562.1|
gi|13622881|gb|AAK34563.1|
gi|13622882|gb|AAK34564.1|
gi|13622885|gb|AAK34566.1|
gi|13622886|gb|AAK34567.1|
gi|13622887|gb|AAK34568.1|
gi|13622888|gb|AAK34569.1|
gi|13622890|gb|AAK34571.1|
gi|13622893|gb|AAK34574.1|
gi|13622896|gb|AAK34576.1|
gi|13622898|gb|AAK34578.1|
gi|13622899|gb|AAK34579.1|
gi|13622900|gb|AAK34580.1|
gi|13622901|gb|AAK34581.1|
gi|13622903|gb|AAK34583.1|
gi|13622905|gb|AAK34585.1|
gi|13622906|gb|AAK34586.1|
gi|13622907|gb|AAK34587.1|
gi|13622908|gb|AAK34588.1|
gi|13622910|gb|AAK34589.1|
gi|13622911|gb|AAK34590.1|
gi|13622912|gb|AAK34591.1|
gi|13622913|gb|AAK34592.1|
gi|13622914|gb|AAK34593.1|
gi|13622915|gb|AAK34594.1|
gi|13622917|gb|AAK34596.1|
gi|13622918|gb|AAK34597.1|
gi|13622919|gb|AAK34598.1|
gi|13622921|gb|AAK34599.1|
gi|13622922|gb|AAK34600.1|
gi|13622924|gb|AAK34602.1|
gi|13622925|gb|AAK34603.1|
gi|13622926|gb|AAK34604.1|
gi|13622927|gb|AAK34605.1|
gi|13622928|gb|AAK34606.1|
gi|13622929|gb|AAK34607.1|
gi|13622930|gb|AAK34608.1|
gi|13622931|gb|AAK34609.1|
gi|13622933|gb|AAK34610.1|
gi|13622941|gb|AAK34617.1|
gi|13622944|gb|AAK34620.1|
gi|13622945|gb|AAK34621.1|
gi|13622947|gb|AAK34623.1|
gi|13622948|gb|AAK34624.1|
gi|13622949|gb|AAK34625.1|
gi|13622950|gb|AAK34626.1|
gi|13622952|gb|AAK34627.1|
gi|13622955|gb|AAK34630.1|
gi|13622956|gb|AAK34631.1|
gi|13622959|gb|AAK34634.1|
gi|13622961|gb|AAK34636.1|
gi|13622963|gb|AAK34638.1|
gi|13622964|gb|AAK34639.1|
gi|13622967|gb|AAK34641.1|
gi|13622969|gb|AAK34643.1|
gi|13622971|gb|AAK34645.1|
gi|13622973|gb|AAK34647.1|
gi|13622974|gb|AAK34648.1|
gi|13622977|gb|AAK34651.1|
gi|13622981|gb|AAK34654.1|
gi|13622982|gb|AAK34655.1|
gi|13622983|gb|AAK34656.1|
gi|13622984|gb|AAK34657.1|
gi|13622985|gb|AAK34658.1|
gi|13622989|gb|AAK34661.1|
gi|13622990|gb|AAK34662.1|
gi|13622991|gb|AAK34663.1|
gi|13622992|gb|AAK34664.1|
gi|13622995|gb|AAK34666.1|
gi|13622996|gb|AAK34667.1|
gi|13622998|gb|AAK34669.1|
gi|13622999|gb|AAK34670.1|
gi|13623000|gb|AAK34671.1|
gi|13623001|gb|AAK34672.1|
gi|13623002|gb|AAK34673.1|
gi|13623004|gb|AAK34674.1|
gi|13623005|gb|AAK34675.1|
gi|13623006|gb|AAK34676.1|
gi|13623007|gb|AAK34677.1|
gi|13623009|gb|AAK34679.1|
gi|13623019|gb|AAK34688.1|
gi|13623020|gb|AAK34689.1|
gi|13623030|gb|AAK34698.1|
gi|13623031|gb|AAK34699.1|
gi|13623032|gb|AAK34700.1|
gi|13623033|gb|AAK34701.1|
gi|13623038|gb|AAK34705.1|



gi|13623045|gb|AAK34712.1|
gi|13623046|gb|AAK34713.1|
gi|13623047|gb|AAK34714.1|
gi|13623049|gb|AAK34715.1|
gi|13623050|gb|AAK34716.1|
gi|13623051|gb|AAK34717.1|
gi|13623052|gb|AAK34718.1|
gi|13623053|gb|AAK34719.1|
gi|13623054|gb|AAK34720.1|
gi|13623056|gb|AAK34722.1|
gi|13623058|gb|AAK34724.1|
gi|13623062|gb|AAK34727.1|
gi|13623064|gb|AAK34729.1|
gi|13623065|gb|AAK34730.1|
gi|13623069|gb|AAK34733.1|
gi|13623074|gb|AAK34738.1|
gi|13623081|gb|AAK34744.1|
gi|13623082|gb|AAK34745.1|



gi|13623083|gb|AAK34746.1|
gi|13623085|gb|AAK34747.1|
gi|13623086|gb|AAK34748.1|
gi|13623088|gb|AAK34750.1|
gi|13623089|gb|AAK34751.1|
gi|13623090|gb|AAK34752.1|
gi|13623091|gb|AAK34753.1|
gi|13623093|gb|AAK34755.1|
gi|13623095|gb|AAK34756.1|
gi|13623096|gb|AAK34757.1|
gi|13623098|gb|AAK34759.1|
gi|13623099|gb|AAK34760.1|
gi|13623100|gb|AAK34761.1|
gi|13623102|gb|AAK34763.1|
gi|13623103|gb|AAK34764.1|
gi|13623105|gb|AAK34766.1|
gi|13623107|gb|AAK34767.1|
gi|13623128|gb|AAK34787.1|
gi|13623129|gb|AAK34788.1|
gi|13623131|gb|AAK34790.1|
gi|13623132|gb|AAK34791.1|
gi|13623133|gb|AAK34792.1|
gi|13623134|gb|AAK34793.1|
gi|13623136|gb|AAK34794.1|
gi|13623138|gb|AAK34796.1|
gi|13623139|gb|AAK34797.1|
gi|13623150|gb|AAK34807.1|
gi|13623151|gb|AAK34808.1|
gi|13623152|gb|AAK34809.1|
gi|13623154|gb|AAK34811.1|
gi|13623155|gb|AAK34812.1|
gi|13623156|gb|AAK34813.1|
gi|13623157|gb|AAK34814.1|
gi|13623159|gb|AAK34815.1|
gi|13623161|gb|AAK34817.1|
gi|13623162|gb|AAK34818.1|
gi|13623163|gb|AAK34819.1|
gi|13623165|gb|AAK34821.1|
gi|13623166|gb|AAK34822.1|
gi|13623167|gb|AAK34823.1|
gi|13623168|gb|AAK34824.1|
gi|13623170|gb|AAK34826.1|
gi|13623171|gb|AAK34827.1|
gi|13623175|gb|AAK34830.1|
gi|13623176|gb|AAK34831.1|
gi|13623177|gb|AAK34832.1|
gi|13623179|gb|AAK34834.1|
gi|13623180|gb|AAK34835.1|
gi|13623182|gb|AAK34836.1|
gi|13623183|gb|AAK34837.1|
gi|13623184|gb|AAK34838.1|
gi|13623185|gb|AAK34839.1|
gi|13623186|gb|AAK34840.1|
gi|13623187|gb|AAK34841.1|
Table 34: List of GAS ORFs which are shared with GBS but not with Spn



gi|13621381|gb|AAK33195.1|
gi|13621423|gb|AAK33233.1|
gi|13621440|gb|AAK33249.1|
gi|13621443|gb|AAK33251.1|
gi|13621453|gb|AAK33260.1|
gi|13621454|gb|AAK33261.1|
gi|13621479|gb|AAK33284.1|
gi|13621482|gb|AAK33287.1|
gi|13621492|gb|AAK33296.1|
gi|13621493|gb|AAK33297.1|
gi|13621497|gb|AAK33300.1|
gi|13621498|gb|AAK33301.1|
gi|13621512|gb|AAK33314.1|
gi|13621514|gb|AAK33316.1|
gi|13621556|gb|AAK33354.1|
gi|13621570|gb|AAK33366.1|
gi|13621587|gb|AAK33382.1|
gi|13621610|gb|AAK33403.1|
gi|13621613|gb|AAK33405.1|
gi|13621626|gb|AAK33418.1|
gi|13621632|gb|AAK33423.1|
gi|13621635|gb|AAK33426.1|
gi|13621643|gb|AAK33433.1|
gi|13621655|gb|AAK33444.1|
gi|13621656|gb|AAK33445.1|
gi|13621659|gb|AAK33448.1|
gi|13621673|gb|AAK33461.1|
gi|13621686|gb|AAK33473.1|
gi|13621696|gb|AAK33482.1|
gi|13621703|gb|AAK33488.1|



gi|13621712|gb|AAK33497.1|
gi|13621728|gb|AAK33511.1|
gi|13621738|gb|AAK33520.1|
gi|13621739|gb|AAK33521.1|
gi|13621740|gb|AAK33522.1|
gi|13621772|gb|AAK33551.1|
gi|13621776|gb|AAK33555.1|
gi|13621791|gb|AAK33569.1|



gi|13621798|gb|AAK33575.1|
gi|13621801|gb|AAK33578.1|
gi|13621803|gb|AAK33580.1|
gi|13621804|gb|AAK33581.1|
gi|13621832|gb|AAK33606.1|
gi|13621833|gb|AAK33607.1|
gi|13621896|gb|AAK33665.1|
gi|13621897|gb|AAK33666.1|
gi|13621906|gb|AAK33674.1|
gi|13621911|gb|AAK33679.1|
gi|13621949|gb|AAK33713.1|
gi|13621951|gb|AAK33715.1|
gi|13621962|gb|AAK33724.1|
gi|13621963|gb|AAK33725.1|
gi|13621964|gb|AAK33726.1|
gi|13621971|gb|AAK33732.1|
gi|13621976|gb|AAK33737.1|
gi|13621983|gb|AAK33744.1|



gi|13621988|gb|AAK33748.1|
gi|13622014|gb|AAK33772.1|
gi|13622015|gb|AAK33773.1|
gi|13622022|gb|AAK33779.1|
gi|13622023|gb|AAK33780.1|
gi|13622028|gb|AAK33784.1|
gi|13622029|gb|AAK33785.1|
gi|13622037|gb|AAK33792.1|
gi|13622038|gb|AAK33793.1|
gi|13622040|gb|AAK33795.1|
gi|13622057|gb|AAK33811.1|
gi|13622061|gb|AAK33814.1|
gi|13622063|gb|AAK33816.1|
gi|13622066|gb|AAK33819.1|
gi|13622067|gb|AAK33820.1|
gi|13622076|gb|AAK33828.1|
gi|13622078|gb|AAK33830.1|
gi|13622084|gb|AAK33835.1|
gi|13622098|gb|AAK33848.1|
gi|13622099|gb|AAK33849.1|
gi|13622100|gb|AAK33850.1|
gi|13622104|gb|AAK33854.1|
gi|13622110|gb|AAK33859.1|
gi|13622116|gb|AAK33865.1|
gi|13622124|gb|AAK33873.1|
gi|13622159|gb|AAK33905.1|
gi|13622193|gb|AAK33936.1|
gi|13622194|gb|AAK33937.1|
gi|13622195|gb|AAK33938.1|
gi|13622196|gb|AAK33939.1|
gi|13622202|gb|AAK33944.1|
gi|13622203|gb|AAK33945.1|
gi|13622206|gb|AAK33948.1|
gi|13622210|gb|AAK33951.1|
gi|13622221|gb|AAK33961.1|
gi|13622231|gb|AAK33971.1|
gi|13622234|gb|AAK33973.1|
gi|13622238|gb|AAK33977.1|
gi|13622245|gb|AAK33984.1|
gi|13622246|gb|AAK33985.1|
gi|13622248|gb|AAK33986.1|
gi|13622249|gb|AAK33987.1|
gi|13622251|gb|AAK33989.1|
gi|13622254|gb|AAK33992.1|
gi|13622267|gb|AAK34004.1|
gi|13622291|gb|AAK34026.1|
gi|13622302|gb|AAK34036.1|
gi|13622303|gb|AAK34037.1|
gi|13622304|gb|AAK34038.1|
gi|13622327|gb|AAK34059.1|
gi|13622344|gb|AAK34074.1|
gi|13622345|gb|AAK34075.1|
gi|13622346|gb|AAK34076.1|
gi|13622347|gb|AAK34077.1|
gi|13622348|gb|AAK34078.1|
gi|13622349|gb|AAK34079.1|



gi|13622382|gb|AAK34109.1|
gi|13622386|gb|AAK34113.1|
gi|13622391|gb|AAK34118.1|
gi|13622392|gb|AAK34119.1|
gi|13622397|gb|AAK34123.1|
gi|13622404|gb|AAK34129.1|
gi|13622412|gb|AAK34136.1|
gi|13622413|gb|AAK34137.1|
gi|13622414|gb|AAK34138.1|
gi|13622418|gb|AAK34142.1|
gi|13622430|gb|AAK34152.1|
gi|13622446|gb|AAK34167.1|
gi|13622449|gb|AAK34169.1|
gi|13622453|gb|AAK34173.1|
gi|13622470|gb|AAK34188.1|
gi|13622487|gb|AAK34204.1|
gi|13622490|gb|AAK34206.1|
gi|13622502|gb|AAK34217.1|
gi|13622503|gb|AAK34218.1|
gi|13622514|gb|AAK34228.1|
gi|13622528|gb|AAK34241.1|
gi|13622540|gb|AAK34252.1|
gi|13622541|gb|AAK34253.1|
gi|13622544|gb|AAK34255.1|
gi|13622545|gb|AAK34256.1|
gi|13622546|gb|AAK34257.1|
gi|13622547|gb|AAK34258.1|
gi|13622548|gb|AAK34259.1|
gi|13622550|gb|AAK34261.1|
gi|13622551|gb|AAK34262.1|
gi|13622552|gb|AAK34263.1|
gi|13622556|gb|AAK34267.1|
gi|13622557|gb|AAK34268.1|
gi|13622558|gb|AAK34269.1|
gi|13622559|gb|AAK34270.1|
gi|13622563|gb|AAK34273.1|
gi|13622571|gb|AAK34281.1|
gi|13622576|gb|AAK34286.1|
gi|13622581|gb|AAK34290.1|
gi|13622582|gb|AAK34291.1|
gi|13622586|gb|AAK34295.1|
gi|13622589|gb|AAK34298.1|
gi|13622605|gb|AAK34312.1|
gi|13622633|gb|AAK34338.1|
gi|13622635|gb|AAK34340.1|
gi|13622637|gb|AAK34342.1|
gi|13622638|gb|AAK34343.1|
gi|13622657|gb|AAK34360.1|
gi|13622707|gb|AAK34404.1|
gi|13622716|gb|AAK34413.1|
gi|13622724|gb|AAK34420.1|
gi|13622732|gb|AAK34427.1|
gi|13622743|gb|AAK34437.1|
gi|13622761|gb|AAK34453.1|
gi|13622773|gb|AAK34464.1|
gi|13622788|gb|AAK34478.1|



gi|13622816|gb|AAK34504.1|
gi|13622817|gb|AAK34505.1|
gi|13622846|gb|AAK34531.1|
gi|13622852|gb|AAK34536.1|
gi|13622874|gb|AAK34556.1|
gi|13622889|gb|AAK34570.1|
gi|13622891|gb|AAK34572.1|
gi|13622892|gb|AAK34573.1|
gi|13622897|gb|AAK34577.1|
gi|13622902|gb|AAK34582.1|
gi|13622904|gb|AAK34584.1|
gi|13622916|gb|AAK34595.1|
gi|13622923|gb|AAK34601.1|
gi|13622934|gb|AAK34611.1|
gi|13622953|gb|AAK34628.1|
gi|13622954|gb|AAK34629.1|
gi|13622960|gb|AAK34635.1|
gi|13622968|gb|AAK34642.1|
gi|13622980|gb|AAK34653.1|
gi|13622987|gb|AAK34659.1|
gi|13623012|gb|AAK34682.1|
gi|13623013|gb|AAK34683.1|
gi|13623014|gb|AAK34684.1|
gi|13623015|gb|AAK34685.1|
gi|13623016|gb|AAK34686.1|
gi|13623018|gb|AAK34687.1|
gi|13623022|gb|AAK34691.1|
gi|13623029|gb|AAK34697.1|
gi|13623037|gb|AAK34704.1|
gi|13623055|gb|AAK34721.1|
gi|13623060|gb|AAK34725.1|
gi|13623061|gb|AAK34726.1|
gi|13623063|gb|AAK34728.1|
gi|13623066|gb|AAK34731.1|
gi|13623068|gb|AAK34732.1|
gi|13623092|gb|AAK34754.1|
gi|13623097|gb|AAK34758.1|
gi|13623104|gb|AAK34765.1|
gi|13623126|gb|AAK34785.1|
gi|13623130|gb|AAK34789.1|
gi|13623137|gb|AAK34795.1|
gi|13623153|gb|AAK34810.1|
gi|13623164|gb|AAK34820.1|
gi|13623178|gb|AAK34833.1|
Table 35: GAS ORFs which are shared with pneumococcus but not with GBS



gi|13621338|gb|AAK33157.1|
gi|13621352|gb|AAK33168.1|
gi|13621410|gb|AAK33221.1|
gi|13621433|gb|AAK33242.1|
gi|13621445|gb|AAK33253.1|
gi|13621446|gb|AAK33254.1|
gi|13621447|gb|AAK33255.1|
gi|13621448|gb|AAK33256.1|
gi|13621449|gb|AAK33257.1|
gi|13621451|gb|AAK33259.1|
gi|13621460|gb|AAK33267.1|
gi|13621466|gb|AAK33272.1|
gi|13621489|gb|AAK33293.1|
gi|13621490|gb|AAK33294.1|
gi|13621519|gb|AAK33320.1|
gi|13621520|gb|AAK33321.1|
gi|13621653|gb|AAK33443.1|
gi|13621722|gb|AAK33506.1|
gi|13621723|gb|AAK33507.1|
gi|13621724|gb|AAK33508.1|
gi|13621805|gb|AAK33582.1|
gi|13621900|gb|AAK33669.1|
gi|13622011|gb|AAK33769.1|
gi|13622212|gb|AAK33953.1|
gi|13622280|gb|AAK34016.1|
gi|13622381|gb|AAK34108.1|
gi|13622409|gb|AAK34134.1|
gi|13622410|gb|AAK34135.1|
gi|13622423|gb|AAK34146.1|
gi|13622428|gb|AAK34151.1|
gi|13622441|gb|AAK34162.1|
gi|13622442|gb|AAK34163.1|
gi|13622454|gb|AAK34174.1|
gi|13622456|gb|AAK34176.1|
gi|13622619|gb|AAK34325.1|
gi|13622642|gb|AAK34346.1|
gi|13622643|gb|AAK34347.1|
gi|13622664|gb|AAK34366.1|
gi|13622666|gb|AAK34368.1|
gi|13622667|gb|AAK34369.1|
gi|13622671|gb|AAK34372.1|
gi|13622672|gb|AAK34373.1|
gi|13622673|gb|AAK34374.1|
gi|13622674|gb|AAK34375.1|
gi|13622679|gb|AAK34380.1|
gi|13622680|gb|AAK34381.1|
gi|13622682|gb|AAK34382.1|
gi|13622755|gb|AAK34448.1|
gi|13622758|gb|AAK34450.1|
gi|13622759|gb|AAK34451.1|
gi|13622835|gb|AAK34521.1|
gi|13622837|gb|AAK34523.1|
gi|13622937|gb|AAK34614.1|
gi|13622942|gb|AAK34618.1|
gi|13622946|gb|AAK34622.1|
gi|13622978|gb|AAK34652.1|



gi|13623027|gb|AAK34695.1|
gi|13623087|gb|AAK34749.1|
gi|13623101|gb|AAK34762.1|
gi|13623144|gb|AAK34802.1|
gi|13623146|gb|AAK34804.1|
gi|13623147|gb|AAK34805.1|




Table 36: Spn ORFs which are shared with GBS and GAS



SP0001 

SP0002 

SP0003 

SP0004 

SP0005 

SP0006 

SP0007 

SP0008 

SP0010 

SP0011 

SP0013 

SP0014 

SP0019 

SP0021 

SP0024 

SP0027 

SP0032 

SP0033 

SP0034 

SP0035 

SP0036 

SP0037 

SP0042 

SP0044 

SP0045 

SP0046 

SP0047 

SP0048 

SP0051 

SP0053 

SP0054 

SP0056 

SP0063

SP0073 

SP0074 

SP0078 

SP0079 

SP0083 

SP0084 

SP0085 

SP0095 

SP0105 

SP0106 

SP0111 

SP0112 

SP0118 

SP0120 

SP0121 

SP0122 

SP0127 

SP0128 

SP0129 

SP0148 

SP0149 

SP0151 

SP0152 



SP0158 

SP0173 

SP0179 

SP0180 

SP0184 

SP0185 

SP0186 

SP0187 

SP0189 

SP0192 

SP0194 

SP0197 

SP0199 

SP0202 

SP0204 

SP0205 

SP0208 

SP0209 

SP0210 

SP0211 

SP0212 

SP0213 

SP0214 

SP0215 

SP0216 

SP0217 

SP0218 

SP0219 

SP0220 

SP0221 

SP0222 

SP0224 

SP0225 

SP0226 

SP0227 

SP0228 

SP0229 

SP0230 

SP0231 

SP0232 

SP0233 

SP0234 

SP0235 

SP0236 

SP0240 

SP0242 

SP0243 

SP0245 

SP0246 

SP0247 

SP0248 

SP0249 

SP0250 

SP0251 

SP0252 

SP0253 



SP0254 

SP0259 

SP0261 

SP0262 

SP0263 

SP0264 

SP0265 

SP0266 

SP0268 

SP0271 

SP0272 

SP0273 

SP0274 

SP0280 

SP0281 

SP0282 

SP0283 

SP0284 

SP0285 

SP0286 

SP0287 

SP0289 

SP0290 

SP0291 

SP0292 

SP0294 

SP0295 

SP0303 

SP0310 

SP0314 

SP0317 

SP0318 

SP0319 

SP0320 

SP0321 

SP0322 

SP0323 

SP0324 

SP0325 

SP0327 

SP0330 

SP0334 

SP0336 

SP0337 

SP0338 

SP0340 

SP0342 

SP0369 

SP0370 

SP0371 

SP0373 

SP0374 

SP0381 

SP0382 

SP0383 

SP0384 



SP0385 

SP0386 

SP0387 

SP0400

SP0401 

SP0402 

SP0403 

SP0404 

SP0405 

SP0406 

SP0408 

SP0410 

SP0411 

SP0412 

SP0415 

SP0416 

SP0417 

SP0418 

SP0419 

SP0420 

SP0421 

SP0422 

SP0423 

SP0424 

SP0425 

SP0426 

SP0427 

SP0433 

SP0434 

SP0435 

SP0436 

SP0437 

SP0438 

SP0439 

SP0441 

SP0442 

SP0443 

SP0452 

SP0453 

SP0454 

SP0457 

SP0458 

SP0459 

SP0461 

SP0466 

SP0467 

SP0474 

SP0477 

SP0478 

SP0483 

SP0486 

SP0488 

SP0489 

SP0493 

SP0494 

SP0499 



SP0500 

SP0501 

SP0502 

SP0515 

SP0516 

SP0517 

SP0519 

SP0521 

SP0522 

SP0523 

SP0526 

SP0549 

SP0550 

SP0552 

SP0553 

SP0554 

SP0555 

SP0556 

SP0557 

SP0563 

SP0567 

SP0568 

SP0576 

SP0577 

SP0578 

SP0579 

SP0581 

SP0588 

SP0589 

SP0591 

SP0592 

SP0593 

SP0603 

SP0604 

SP0605 

SP0608 

SP0610 

SP0611 

SP0613 

SP0614 

SP0615 

SP0616 

SP0618 

SP0620 

SP0622 

SP0623 

SP0624 

SP0626 

SP0630 

SP0631 

SP0636 

SP0637 

SP0638 

SP0645 

SP0646 

SP0647 



SP0652 

SP0657 

SP0660 

SP0662 

SP0663 

SP0665 

SP0668 

SP0669 

SP0671 

SP0672 

SP0673 

SP0674 

SP0675 

SP0676 

SP0678 

SP0680 

SP0681 

SP0687 

SP0688 

SP0689 

SP0690 

SP0701 

SP0702 

SP0709 

SP0713 

SP0726 

SP0727 

SP0729 

SP0735 

SP0736 

SP0741 

SP0744 

SP0745 

SP0746 

SP0756 

SP0757 

SP0758 

SP0760 

SP0761 

SP0762 

SP0764 

SP0765 

SP0766 

SP0767 

SP0768 

SP0770 

SP0771 

SP0775 

SP0776 

SP0778 

SP0779 

SP0780 

SP0782 

SP0784 

SP0785 

SP0786 



SP0787 

SP0788 

SP0792 

SP0793 

SP0797 

SP0798 

SP0799 

SP0801 

SP0802 

SP0803 

SP0805 

SP0806 

SP0807 

SP0816 

SP0817 

SP0820 

SP0822 

SP0823 

SP0824 

SP0825 

SP0828 

SP0829 

SP0831 

SP0835 

SP0837 

SP0838 

SP0839 

SP0841 

SP0843 

SP0844 

SP0845 

SP0846 

SP0847 

SP0848 

SP0851 

SP0852 

SP0855 

SP0856 

SP0862 

SP0864

SP0865 

SP0867 

SP0868 

SP0869 

SP0870 

SP0871 

SP0872 

SP0873 

SP0875 

SP0876 

SP0877 

SP0878 

SP0880 

SP0881 

SP0893 

SP0894 



SP0895 

SP0896 

SP0897 

SP0904 

SP0905 

SP0908 

SP0909 

SP0912 

SP0923 

SP0927 

SP0928 

SP0929 

SP0931 

SP0932 

SP0933 

SP0935 

SP0936 

SP0937 

SP0938 

SP0943 

SP0944 

SP0945 

SP0946 

SP0947 

SP0948 

SP0954 

SP0955 

SP0959 

SP0960 

SP0961 

SP0982 

SP0964 

SP0966 

SP0967 

SP0968 

SP0969 

SP0970 

SP0971 

SP0972 

SP0974 

SP0975 

SP0976 

SP0978 

SP0979 

SP0980 

SP0981 

SP0984 

SP0985 

SP0987 

SP0988 

SP0989 

SP0991 

SP0992 

SP0993 

SP1002 

SP1003 



SP1004 

SP1008 

SP1010 

SP1012 

SP1016 

SP1017 

SP1018 

SP1020 

SP1021 

SP1022 

SP1024 

SP1025 

SP1026 

SP1029 

SP1033 

SP1034 

SP1035 

SP1045 

SP1056 

SP1067 

SP1068 

SP1069 

SP1070 

SP1071 

SP1072 

SP1073 

SP1074 

SP1076 

SP1079 

SP1081 

SP1082 

SP1083 

SP1084 

SP1087 

SP1088 

SP1089 

SP1090 

SP1093 

SP1094 

SP1095 

SP1096 

SP1097 

SP1098 

SP1099 

SP1100 

SP1102 

SP1105 

SP1106 

SP1107 

SP1110 

SP1111 

SP1112 

SP1113 

SP1114 

SP1115 

SP1116 



SP1117 

SP1118 

SP1119 

SP1128 

SP1151 

SP1152 

SP1155 

SP1156 

SP1157 

SP1159 

SP1160 

SP1161 

SP1162 

SP1163 

SP1164 

SP1167 

SP1168 

SP1169 

SP1174 

SP1175 

SP1176 

SP1177 

SP1178 

SP1179 

SP1180 

SP1182 

SP1184 

SP1185 

SP1187 

SP1190 

SP1191 

SP1192 

SP1193 

SP1197 

SP1200 

SP1202 

SP1204 

SP1205 

SP1207 

SP1208 

SP1212 

SP1213 

SP1218 

SP1219 

SP1220 

SP1225 

SP1226 

SP1227 

SP1228 

SP1229 

SP1230 

SP1231 

SP1232 

SP1233 

SP1238 

SP1241 



SP1242 

SP1244 

SP1245 

SP1246 

SP1247 

SP1248 

SP1249 

SP1260 

SP1263 

SP1266 

SP1275 

SP1276 

SP1277 

SP1278 

SP1279 

SP1280 

SP1283 

SP1284 

SP1285 

SP1286 

SP1287 

SP1288 

SP1289 

SP1290 

SP1291 

SP1293 

SP1297 

SP1298 

SP1299 

SP1308 

SP1316 

SP1324 

SP1329 

SP1330 

SP1331 

SP1336 

SP1341 

SP1354 

SP1355 

SP1357 

SP1358 

SP1359 

SP1362 

SP1368 

SP1370 

SP1371 

SP1372 

SP1374 

SP1375 

SP1376 

SP1377 

SP1378 

SP1380 

SP1381 

SP1383 

SP1386 



SP1387 

SP1388 

SP1389 

SP1390 

SP1393 

SP1394 

SP1395 

SP1396 

SP1397 

SP1398 

SP1399 

SP1400 

SP1402 

SP1403 

SP1404 

SP1405 

SP1406 

SP1407 

SP1408 

SP1409 

SP1411 

SP1412 

SP1413 

SP1414 

SP1415 

SP1416 

SP1420 

SP1421 

SP1427 

SP1428 

SP1429 

SP1434 

SP1435 

SP1445 

SP1446 

SP1448 

SP1449 

SP1450 

SP1452 

SP1453 

SP1456 

SP1457 

SP1458 

SP1460 

SP1461 

SP1462 

SP1465 

SP1466 

SP1469 

SP1470 

SP1473 

SP1474 

SP1475 

SP1478 

SP1479 

SP1482 



SP1483 

SP1485 

SP1489 

SP1491 

SP1498 

SP1500 

SP1501 

SP1502 

SP1504 

SP1505 

SP1507 

SP1508 

SP1509 

SP1510 

SP1511 

SP1512 

SP1513 

SP1517 

SP1518 

SP1519 

SP1521 

SP1522 

SP1523 

SP1529 

SP1530 

SP1534 

SP1535 

SP1536 

SP1537 

SP1538 

SP1539 

SP1540 

SP1541 

SP1542 

SP1544 

SP1547 

SP1549 

SP1551 

SP1552 

SP1553 

SP1554 

SP1557 

SP1558 

SP1559 

SP1560 

SP1561 

SP1563 

SP1564 

SP1565 

SP1566 

SP1568 

SP1569 

SP1571 

SP1574 

SP1575 

SP1577 



SP1580 

SP1583 

SP1584 

SP1586 

SP1587 

SP1588 

SP1589 

SP1590 

SP1591 

SP1597 

SP1598 

SP1599 

SP1602 

SP1603 

SP1606 

SP1608 

SP1609 

SP1610 

SP1615 

SP1616 

SP1617 

SP1624 

SP1625 

SP1626 

SP1631 

SP1633 

SP1638 

SP1644 

SP1645 

SP1646 

SP1647 

SP1648 

SP1649 

SP1650 

SP1652 

SP1653 

SP1655 

SP1659 

SP1661 

SP1662 

SP1664 

SP1665 

SP1666

SP1667 

SP1668 

SP1670 

SP1671 

SP1672 

SP1674 

SP1675 

SP1676 

SP1677 

SP1681 

SP1682 

SP1683 

SP1684 



SP1685 

SP1688 

SP1689 

SP1697 

SP1698 

SP1699 

SP1702 

SP1709 

SP1711 

SP1712 

SP1713 

SP1714 

SP1717 

SP1721 

SP1722 

SP1724 

SP1725 

SP1726 

SP1727 

SP1732 

SP1733 

SP1734 

SP1735 

SP1736 

SP1737 

SP1738 

SP1739 

SP1742 

SP1743 

SP1744 

SP1746 

SP1747 

SP1748 

SP1749 

SP1750 

SP1752 

SP1759 

SP1776 

SP1780 

SP1781 

SP1782 

SP1785 

SP1790 

SP1795 

SP1799 

SP1804 

SP1816 

SP1817 

SP1825 

SP1839 

SP1840 

SP1845 

SP1847 

SP1848 

SP1851 

SP1855 



SP1857 
SP1858 
SP1860 
SP1861 
SP1865 
SP1871 
SP1873 
SP1874 
SP1875 
SP1876 
SP1877 
SP1878 
SP1879 
SP1880 
SP1881 
SP1883 
SP1884 
SP1887 
SP1888 
SP1889 
SP1890 
SP1895 
SP1896 
SP1900 
SP1901 
SP1902 
SP1903 
SP1906
SP1908 
SP1909 
SP1916 
SP1918 
SP1922 
SP1940 
SP1942 
SP1944 
SP1953 
SP1957 
SP1960 
SP1961 
SP1963 
SP1964 
SP1966 
SP1967 
SP1968 
SP1969 
SP1970 
SP1972 
SP1973 
SP1974 
SP1975 
SP1976 
SP1979 
SP1980 
SP1981 
SP1982 



SP1983 

SP1984 

SP1985 

SP1987 

SP1989 

SP1990 

SP1991 

SP1993 

SP1994 

SP1996 

SP1997 

SP1998 

SP1999 

SP2006 

SP2007 

SP2010 

SP2011 

SP2012 

SP2020 

SP2021 

SP2022 

SP2027 

SP2028 

SP2030 

SP2031 

SP2032 

SP2033 

SP2034 

SP2035 

SP2036 

SP2037 

SP2038 

SP2040 

SP2041 

SP2042 

SP2044 

SP2045 

SP2048 

SP2052 

SP2053 

SP2054 

SP2055 

SP2056 

SP2057 

SP2058 

SP2063 

SP2065 

SP2069 

SP2070 

SP2072 

SP2073 

SP2075 

SP2077 

SP2078 

SP2082 

SP2083 



SP2085 

SP2086 

SP2087 

SP2088 

SP2090 

SP2091 

SP2092 

SP2094 

SP2099 

SP2100 

SP2101 

SP2106 

SP2107 

SP2108 

SP2109 

SP2110 

SP2112 

SP2113 

SP2114 

SP2119 

SP2121 

SP2129 

SP2131 

SP2135 

SP2142 

SP2148 

SP2150 

SP2151 

SP2152 

SP2153 

SP2156 

SP2161 

SP2162 

SP2169 

SP2170 

SP2171 

SP2172 

SP2173 

SP2174 

SP2175 

SP2176 

SP2184 

SP2185 

SP2186 

SP2187 

SP2188 

SP2189 

SP2191 

SP2192 

SP2193 

SP2194 

SP2195 

SP2202 

SP2203 

SP2204 

SP2205 



SP2206 

SP2207 

SP2208 

SP2209 

SP2210 

SP2214 

SP2215 

SP2216 

SP2219 

SP2220 

SP2221 

SP2222 

SP2224 

SP2225 

SP2226 

SP2227 

SP2228 

SP2229 

SP2230 

SP2231 

SP2233 

SP2234 

SP2235 

SP2238 

SP2239 

SP2240 
Table 37: Spn ORFs which are shared with GBS but not with GAS



SP0012
SP0020
SP0039
SP0050
SP0082
SP0107
SP0113
SP0119
SP0146
SP0150
SP0175
SP0176
SP0177
SP0178
SP0237
SP0255
SP0260
SP0267
SP0278
SP0288
SP0346
SP0347
SP0348
SP0349
SP0366
SP0376
SP0413
SP0445
SP0462
SP0463
SP0479
SP0480
SP0482
SP0484
SP0537
SP0538
SP0566
SP0580
SP0585
SP0599
SP0600
SP0601
SP0606
SP0607
SP0609
SP0617
SP0627
SP0655
SP0656
SP0710
SP0711
SP0717
SP0718
SP0720
SP0723
SP0724
SP0725
SP0730
SP0739
SP0749
SP0750
SP0751
SP0752
SP0753
SP0754
SP0769
SP0789
SP0791
SP0826
SP0900
SP0913
SP0914
SP0939
SP0941
SP0942
SP0953
SP0973
SP0977
SP1011
SP1013
SP1027
SP1054
SP1055
SP1080
SP1086
SP1121
SP1122
SP1123
SP1124
SP1126
SP1127
SP1137
SP1166
SP1173
SP1194
SP1195
SP1215
SP1240
SP1256
SP1261
SP1271
SP1272
SP1273
SP1274
SP1306
SP1310
SP1332
SP1333
SP1334
SP1346
SP1348
SP1350
SP1360
SP1361
SP1365
SP1382
SP1384
SP1392
SP1447
SP1451
SP1463
SP1464
SP1471
SP1472
SP1524
SP1527
SP1600
SP1605
SP1607
SP1632
SP1634
SP1651
SP1673
SP1680
SP1695
SP1700
SP1701
SP1720
SP1729
SP1740
SP1741
SP1745
SP1751
SP1757
SP1758
SP1761
SP1762
SP1763
SP1764
SP1765
SP1766
SP1767
SP1768
SP1770
SP1771
SP1772
SP1783
SP1802
SP1828
SP1856
SP1867
SP1869
SP1870
SP1872
SP1891
SP1907
SP1910
SP1911
SP1927
SP1928
SP1943
SP1959
SP2001
SP2002
SP2009
SP2026
SP2029
SP2039
SP2061
SP2064
SP2066
SP2079
SP2084
SP2095
SP2096
SP2098
SP2103
SP2127
SP2128
SP2130
SP2134
SP2137
SP2138
SP2157
SP2196






Table 38: Spn ORFs which are shared with GAS but not with GBS



SP0065
SP0075
SP0090
SP0091
SP0092
SP0099
SP01G0
SP0153
SP0155
SP0156
SP0200
SP0306
SP0313
SP0341
SP0476
SP0496
SP0509
SP0527
SP0648
SP0658
SP0659
SP0661
SP0677
SP0715
SP0742
SP0743
SP0858
SP0859
SP0860
SP0910
SP0986
SP0994
SP0999
SP1000
SP1001
SP1023
SP1075
SP1129
SP1147
SP1171
SP1186
SP1315
SP1317
SP1319
SP1320
SP1321
SP1322
SP1438
SP1442
SP1525
SP1646
SP1570
SP1572
SP1578
SP1604
SP1715
SP1754
SP1797
SP1798
SP1800
SP1885
SP1919
SP1923
SP1941
SP1950
SP2016
SP2017
SP2051
SP2060
SP2111
SP2143
SP2144
SP2201
SP2236